Abstract 

We establish conditions on sequences of graphs which ensure that the mixing times of 
the random walks on the graphs in the sequence converge. The main assumption is that the 
graphs, associated measures and heat kernels converge in a suitable Gromov-Hausdorff sense. 
With this result we are able to establish the convergence of the mixing times on the largest 
component of the Erdős-Rényi random graph in the critical window, sharpening previous 
results for this random graph model. Our results also enable us to establish convergence in 
a number of other examples, such as finitely ramified fractal graphs, Galton-Watson trees 
and the range of a high-dimensional random walk. 

1 Introduction 

The geometric and analytic properties of random graphs have been the subject of much recent 
research. One strand of this development has been to examine sequences of random subgraphs 
of vertex transitive graphs that are, in some sense, at or near criticality. A key example is the 
percolation model and, for bond percolation above the upper critical dimension, we expect to 
see mean-field behavior in the sequence of finite graphs in the critical window. That is, the 
natural scaling exponents for the volume and diameter of the graph and for the mixing time are 
of the same order as those for the Erdős-Rényi random graph in the critical window, as given 
in [35]. 

This mean-field behavior is seen in other natural models of sequences of critical random 
graphs. For example, [6] obtained general conditions for the geometric properties of percolation 
clusters on sequences of finite graphs and discussed examples such as the high dimensional torus 
and the n-cube, while the random walk on critical percolation clusters on the high-dimensional 
torus is treated in [23]. Motivated by these results we will focus on the asymptotic behavior of 
mixing times for random walks on sequences of finite graphs. We consider general sequences of 
graphs but under some strong conditions which will enable us to establish the convergence of 
the mixing time. 

In order to demonstrate our main result we consider the Erdős-Rényi random graph. Let 
$G(N,p)$ be the random subgraph of the complete graph on $N$ labeled vertices $\{1, \dots, N\}$ in 
which each edge is present with probability $p$ independently of the other edges. It is a classical 
result that if we set $p = c/N$, then as $N \to \infty$, if $c > 1$ there is a giant component containing a 
positive fraction of the vertices, while for $c < 1$ the largest component is of size $\log N$. However, 
if $p = N^{-1} + \lambda N^{-4/3}$ for some $\lambda \in \mathbb{R}$, we are in the so-called critical window, and it is known that 
the largest connected component $\mathcal{C}^N$ is of order $N^{2/3}$. The recent work of [1] has shown that 
the scaling limit of the graph, $\mathcal{M}$, exists and can be constructed from the continuum random 
tree. 
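
The model just described can be simulated directly. The following is a minimal sketch, not part of the original argument: the function names `sample_gnp`, `largest_component` and `critical_p`, as well as the values of `n`, `lam` and the seed, are illustrative assumptions. It samples $G(N,p)$ with $p = N^{-1} + \lambda N^{-4/3}$ and extracts a largest connected component, whose size can then be compared with the $N^{2/3}$ scale.

```python
import random
from collections import deque


def sample_gnp(n, p, rng):
    """Sample G(n, p) on vertices 1..n: each edge is present independently with probability p."""
    adj = {v: set() for v in range(1, n + 1)}
    for u in range(1, n + 1):
        for v in range(u + 1, n + 1):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def largest_component(adj):
    """Return the vertex set of a largest connected component, found by breadth-first search."""
    seen, best = set(), set()
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def critical_p(n, lam):
    """Edge probability in the critical window: p = 1/n + lam * n**(-4/3)."""
    return 1.0 / n + lam * n ** (-4.0 / 3.0)


rng = random.Random(0)  # fixed seed, chosen only for reproducibility
n, lam = 400, 0.0       # illustrative values, not taken from the paper
adj = sample_gnp(n, critical_p(n, lam), rng)
comp = largest_component(adj)
print(len(comp), n ** (2.0 / 3.0))  # component size next to the N^{2/3} scale
```

A single sample at moderate $N$ only indicates the order of magnitude; the statements in the text concern the limit $N \to \infty$.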

For the Erdős-Rényi random graph above criticality, [18] and [I] established mixing time 
bounds for the simple random walk on the giant component. The simple random walk on this 
graph is the discrete time Markov chain with transition probabilities determined by 
$p(x,y) = 1/\deg(x)$ for all $y$ such that $(x,y)$ is an edge in $\mathcal{C}^N$. For the random graph in the critical 
window, the following result on the mixing time $t^1_{\mathrm{mix}}(\mathcal{C}^N)$ (a precise definition will be given 
later in (1.8), see also Remark 1.3) of the lazy random walk (a version of the simple random 
walk which remains at its current vertex with probability 1/2, and otherwise moves as the simple 
random walk) was obtained by Nachmias and Peres ([35, Theorem 1.1]). 

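Theorem 1.1. Let $\mathcal{C}^N$ be the largest connected component of $G(N, (1 + \lambda N^{-1/3})/N)$ for some 
$\lambda \in \mathbb{R}$. Then, for any $\varepsilon > 0$, there exists $A = A(\varepsilon, \lambda) < \infty$ such that for all large $N$, 

$$P\left(t^1_{\mathrm{mix}}(\mathcal{C}^N) \notin [A^{-1}N, AN]\right) < \varepsilon.$$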

It is natural to ask for more refined results on the behavior of the family of mixing times. 
The purpose of this paper is to give a general criterion for the convergence of mixing times for 
a sequence of simple random walks on finite graphs in the setting where the graphs can be 
embedded nicely in a compact metric space. Due to the recent work of [1] and [9] we can apply 
our main result to the case of the Erdős-Rényi random graph, to obtain the following result. 

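Theorem 1.2. Fix $p \in [1,\infty]$. If $t^p_{\mathrm{mix}}(\rho^N)$ is the $L^p$-mixing time of the simple random walk on 
$\mathcal{C}^N$ started from its root $\rho^N$, then 

$$N^{-1}\, t^p_{\mathrm{mix}}(\rho^N) \to t^p_{\mathrm{mix}}(\rho),$$

in distribution, where the random variable $t^p_{\mathrm{mix}}(\rho) \in (0,\infty)$ is the $L^p$-mixing time of the 
Brownian motion on $\mathcal{M}$ started from $\rho$. 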

We will later illustrate our main result with a number of other examples of random walks 
on sequences of finite graphs. In order to state it, though, we start by describing the general 
framework in which we work. Firstly, let $(F, d_F)$ be a compact metric space and let $\pi$ be 
a non-atomic Borel probability measure on $F$ with full support. We will assume that balls 
$B_F(x,r) := \{y \in F : d_F(x,y) < r\}$ are $\pi$-continuity sets (i.e. $\pi(\partial B_F(x,r)) = 0$ for every 
$x \in F$, $r > 0$). Secondly, take $X^F = (X^F_t)_{t \ge 0}$ to be a $\pi$-symmetric Hunt process on $F$ (for 
definition and properties see [19]), which will typically be the Brownian motion on the limit of 
the sequence of graphs. We suppose the following: 

• $X^F$ is conservative, i.e. its semigroup $(P_t)_{t \ge 0}$ satisfies $P_t 1 = 1$, $\pi$-a.e., $\forall t > 0$, (1.1) 

• there exists a jointly continuous transition density $(q_t(x,y))_{x,y \in F, t > 0}$ of $X^F$, (1.2) 

• for every $x,y \in F$ and $t > 0$, $q_t(x,y) > 0$, (1.3) 

• for every $x \in F$ and $t > 0$, $q_t(x,\cdot)$ is not identically equal to 1, (1.4) 

where conditions (1.3) and (1.4) are assumed to exclude various trivial cases, and by transition 
density we mean the kernel $q_t(x,y)$ such that 

$$E_x\left[f(X^F_t)\right] = \int_F q_t(x,y) f(y)\, \pi(dy),$$

for all bounded continuous functions $f$ on $F$. Furthermore, we will say that the transition density 
$(q_t(x,y))_{x,y \in F, t > 0}$ converges to stationarity in an $L^p$ sense for some $p \in [1,\infty]$ if it holds that 

$$\lim_{t \to \infty} D_p(x,t) = 0, \tag{1.5}$$

for every $x \in F$, where $D_p(x,t) := \|q_t(x,\cdot) - 1\|_{L^p(\pi)}$. If this previous condition is satisfied, then 
it is possible to check that the $L^p$-mixing time of $F$, 

$$t^p_{\mathrm{mix}}(F) := \inf\left\{t > 0 : \sup_{x \in F} D_p(x,t) \le 1/4\right\}, \tag{1.6}$$

is a finite quantity (see Section 3). Finally, note that $t^p_{\mathrm{mix}}(F) \le t^{p'}_{\mathrm{mix}}(F)$ for $p \le p'$, which can 
easily be shown using the Hölder inequality. 
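
The monotonicity in $p$ rests on the fact that, on a probability space, $\|f\|_{L^p(\pi)} \le \|f\|_{L^{p'}(\pi)}$ whenever $p \le p'$. The following small numerical check on a three-point space is only a sketch; the helper `lp_norm` and the particular values of `pi` and `f` are illustrative assumptions.

```python
def lp_norm(f, pi, p):
    """L^p(pi) norm of f on a finite probability space; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(f[x]) for x in pi if pi[x] > 0)
    return sum(abs(f[x]) ** p * pi[x] for x in pi) ** (1.0 / p)


# Illustrative probability measure and function (not from the paper).
pi = {"a": 0.5, "b": 0.3, "c": 0.2}
f = {"a": 0.4, "b": -1.0, "c": 2.5}

# Norms for increasing exponents; they should form a non-decreasing list.
norms = [lp_norm(f, pi, p) for p in (1, 2, 4, float("inf"))]
print(norms)
```

Applying this with $f = q_t(x,\cdot) - 1$ gives $D_p(x,t) \le D_{p'}(x,t)$, so the threshold $1/4$ is reached no later for $p$ than for $p'$.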

We continue by introducing some general notation for graphs and their associated random 
walks. First, fix G = (V(G), E(G)) to be a finite connected graph with at least two vertices, 
where $V(G)$ denotes the vertex set and $E(G)$ the edge set of $G$, and suppose $d_G$ is a metric on 
$V(G)$. In some examples, $d_G$ will be a rescaled version of the usual shortest path graph distance, 
by which we mean that $d_G(x,y)$ is some multiple of the number of edges in the shortest path 
from $x$ to $y$ in $G$, but this is not always the most convenient choice. Define a symmetric weight 
function $\mu^G : V(G)^2 \to \mathbb{R}_+$ that satisfies $\mu^G_{xy} > 0$ if and only if $\{x,y\} \in E(G)$. The discrete 
time random walk on the weighted graph $G$ is then the Markov chain $((X^G_m)_{m \ge 0}, \mathbf{P}^G_x, x \in V(G))$ 
with transition probabilities $(P_G(x,y))_{x,y \in V(G)}$ defined by $P_G(x,y) := \mu^G_{xy}/\mu^G_x$, where 
$\mu^G_x := \sum_{y \in V(G)} \mu^G_{xy}$. If we define a measure $\pi^G$ on $V(G)$ by setting, for $A \subseteq V(G)$, 
$\pi^G(A) := \sum_{x \in A} \mu^G_x / \sum_{x \in V(G)} \mu^G_x$, then $\pi^G$ is the invariant probability measure for $X^G$. The 
transition density of $X^G$, with respect to $\pi^G$, is given by $(p^G_m(x,y))_{x,y \in V(G), m \ge 0}$, where 

$$p^G_m(x,y) := \frac{\mathbf{P}^G_x(X^G_m = y)}{\pi^G(\{y\})}.$$

Due to parity concerns for bipartite graphs, we will consider a smoothed version of this function 
$(q^G_m(x,y))_{x,y \in V(G), m \ge 0}$ obtained by setting 

$$q^G_m(x,y) := \frac{p^G_m(x,y) + p^G_{m+1}(x,y)}{2}, \tag{1.7}$$

and define the $L^p$-mixing time of $G$ by 

$$t^p_{\mathrm{mix}}(G) := \inf\left\{m > 0 : \sup_{x \in V(G)} D^G_p(x,m) \le 1/4\right\}, \tag{1.8}$$

where $D^G_p(x,m) := \|q^G_m(x,\cdot) - 1\|_{L^p(\pi^G)}$. Finally, in the case that we are considering a sequence 
of graphs $(G^N)_{N \ge 1}$, we will usually abbreviate $\pi^{G^N}$ to $\pi^N$ and $q^{G^N}$ to $q^N$, etc. 
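
The definitions above translate directly into a computation on a small weighted graph. The following sketch is an illustration under stated assumptions: the graph is encoded as a symmetric dict-of-dicts `mu` of edge weights, and the names `transition_matrix`, `step`, `lp_distance`, `mixing_time` and the cut-off `max_steps` are hypothetical helpers, not notation from the paper. It evaluates the smoothed density (1.7) and returns the mixing time (1.8).

```python
def transition_matrix(mu):
    """mu[x][y] = mu[y][x] > 0 exactly on edges. Returns (P, pi) as defined in the text."""
    mu_x = {x: sum(nbrs.values()) for x, nbrs in mu.items()}
    total = sum(mu_x.values())
    P = {x: {y: w / mu_x[x] for y, w in nbrs.items()} for x, nbrs in mu.items()}
    pi = {x: mu_x[x] / total for x in mu}
    return P, pi


def step(law, P):
    """Advance a distribution (dict over vertices) by one step of the walk."""
    out = {x: 0.0 for x in P}
    for x, mass in law.items():
        for y, pxy in P[x].items():
            out[y] += mass * pxy
    return out


def lp_distance(q, pi, p):
    """|| q - 1 ||_{L^p(pi)} for a density q with respect to pi."""
    if p == float("inf"):
        return max(abs(q[y] - 1.0) for y in pi)
    return sum(abs(q[y] - 1.0) ** p * pi[y] for y in pi) ** (1.0 / p)


def mixing_time(mu, p, max_steps=10_000):
    """Smallest m > 0 with sup_x ||q_m(x, .) - 1||_{L^p(pi)} <= 1/4, q_m = (p_m + p_{m+1}) / 2."""
    P, pi = transition_matrix(mu)
    laws = {x: step({x: 1.0}, P) for x in P}  # law of X_1 from each start
    for m in range(1, max_steps + 1):
        nxt = {x: step(laws[x], P) for x in P}  # law of X_{m+1}
        worst = 0.0
        for x in P:
            q = {y: 0.5 * (laws[x][y] + nxt[x][y]) / pi[y] for y in pi}
            worst = max(worst, lp_distance(q, pi, p))
        if worst <= 0.25:
            return m
        laws = nxt
    raise RuntimeError("threshold not reached within max_steps")


# A 6-cycle with unit weights is bipartite, so the smoothing in (1.7) matters here.
cycle = {i: {(i - 1) % 6: 1.0, (i + 1) % 6: 1.0} for i in range(6)}
t1, t2, tinf = (mixing_time(cycle, p) for p in (1, 2, float("inf")))
print(t1, t2, tinf)
```

On a bipartite graph the unsmoothed density $p^G_m(x,\cdot)$ vanishes on one parity class at every step, so its distance from 1 does not tend to zero; averaging two consecutive steps removes this obstruction.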

Remark 1.3. In [35], the mixing time of $\mathcal{C}^N$ is defined in terms of the total variation distance, 
that is 

$$T_{\mathrm{mix}}(\mathcal{C}^N) = \min\left\{t : \|P_t(x,\cdot) - \pi(\cdot)\|_{TV} \le 1/8,\ \forall x \in V(\mathcal{C}^N)\right\}, \tag{1.9}$$

where $P_t(x,B) = \sum_{y \in B} p^{\mathcal{C}^N}_t(x,y)\pi(y)$ for $B \subseteq V(\mathcal{C}^N)$, $p^{\mathcal{C}^N}_t(x,y)$ is the transition density for the 
random walk and $\|\mu - \nu\|_{TV} = \max_{B \subseteq V(\mathcal{C}^N)} |\mu(B) - \nu(B)|$ for probability measures $\mu, \nu$ on 
$V(\mathcal{C}^N)$. (To be precise, 1/8 in (1.9) is 1/4 in [35], but this only affects the constants in the 
results.) However, noting that 

$$\|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{x \in V(\mathcal{C}^N)} |\mu(\{x\}) - \nu(\{x\})|,$$

(see, for example [Ml Proposition 4.2]), one sees that $T_{\mathrm{mix}}(\mathcal{C}^N) = t^1_{\mathrm{mix}}(\mathcal{C}^N)$. Also note that [35] 
considers the lazy walk on the graph to avoid parity issues, but the same techniques will apply 
to the mixing time defined in terms of the smoothed heat kernel introduced at (1.7). 
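
The identity between the two expressions for the total variation distance can be checked by brute force on a small state space. This is only a sketch; the function names `tv_by_sets` and `tv_by_l1` and the measures `mu`, `nu` are illustrative assumptions.

```python
from itertools import combinations


def tv_by_sets(mu, nu):
    """max over all subsets B of |mu(B) - nu(B)|; exponential in the number of states."""
    states = list(mu)
    best = 0.0
    for r in range(len(states) + 1):
        for B in combinations(states, r):
            best = max(best, abs(sum(mu[x] - nu[x] for x in B)))
    return best


def tv_by_l1(mu, nu):
    """Half of the l^1 distance between the two probability vectors."""
    return 0.5 * sum(abs(mu[x] - nu[x]) for x in mu)


# Illustrative probability measures on three points.
mu = {"a": 0.5, "b": 0.3, "c": 0.2}
nu = {"a": 0.2, "b": 0.3, "c": 0.5}
print(tv_by_sets(mu, nu), tv_by_l1(mu, nu))
```

Taking $\nu = \pi$ and $\mu = P_t(x,\cdot)$, the sum $\sum_y |\mu(\{y\}) - \pi(\{y\})|$ is exactly the $L^1(\pi)$ norm of the density minus 1, which is why the threshold 1/8 in total variation corresponds to 1/4 in $L^1$.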

We are now ready to state the assumption under which we are able to prove the convergence 
of mixing times for the random walks on a sequence of graphs. This captures the idea that, 
when suitably rescaled, the discrete state spaces, invariant measures and transition densities of 
a sequence of graphs converge to $(F, d_F)$, $\pi$ and $(q_t(x,y))_{x,y \in F, t > 0}$, respectively. Its formulation 
involves a spectral Gromov-Hausdorff topology, the definition of which is postponed until Section 
2, and a useful sufficient condition for it will be given in Proposition 2.4 below. Note that we 
extend the definition of the discrete transition densities on graphs to all positive times by linear 
interpolation of $(q^G_m(x,y))_{m \ge 0}$ for each pair of vertices $x, y \in V(G)$. Note also that the extended 
transition densities are different from those of continuous time Markov chains. 
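
The extension to real times is plain linear interpolation between consecutive integer times. A minimal sketch follows, assuming the values $q^G_m(x,y)$ for a fixed pair $(x,y)$ are stored in a list `values` indexed by $m$; the function name `interpolate` is hypothetical.

```python
import math


def interpolate(values, t):
    """values[m] = q_m(x, y) for m = 0, ..., M (M >= 1); linear interpolation at real t in [0, M]."""
    m = min(int(math.floor(t)), len(values) - 2)  # left endpoint of the enclosing unit interval
    frac = t - m
    return (1.0 - frac) * values[m] + frac * values[m + 1]


# Illustrative values only.
print(interpolate([0.0, 2.0, 1.0], 0.5), interpolate([0.0, 2.0, 1.0], 1.5))
```

The rescaled density at time $t$ in Assumption 1 below is then obtained by evaluating this interpolation at the real time $\gamma(N)t$.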

Assumption 1. $(G^N)_{N \ge 1}$ is a sequence of finite connected graphs with at least two vertices for 
which there exists a sequence $(\gamma(N))_{N \ge 1}$ such that, for any compact interval $I \subset (0,\infty)$, 

$$\left( (V(G^N), d_{G^N}),\ \pi^N,\ \left(q^N_{\gamma(N)t}(x,y)\right)_{x,y \in V(G^N),\, t \in I} \right) \to \left( (F, d_F),\ \pi,\ (q_t(x,y))_{x,y \in F,\, t \in I} \right)$$

in a spectral Gromov-Hausdorff sense. 

In the case where we have random graphs, we will typically assume that we have the above 
convergence holding in distribution. Our main conclusion is then the following. 

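Theorem 1.4. Suppose that Assumption 1 is satisfied. If $p \in [1,\infty]$ is such that the transition 
density $(q_t(x,y))_{x,y \in F, t > 0}$ converges to stationarity in an $L^p$ sense, then $t^p_{\mathrm{mix}}(F) \in (0,\infty)$ and 

$$\gamma(N)^{-1}\, t^p_{\mathrm{mix}}(G^N) \to t^p_{\mathrm{mix}}(F). \tag{1.10}$$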

In Section 3.2, we will explain how to derive a variation of Theorem 1.4 that concerns the 
convergence of mixing times of processes started at a distinguished point in the state space. 

We emphasize that a key part of our paper is to verify Assumption 1 and apply Theorem 
1.4 in various interesting examples (including the Erdős-Rényi random graphs in the critical 
window as mentioned above). Therefore, we devote considerable space to applying our results 
to such examples. 

The organization of the paper is as follows. In Section 2, we give a precise definition of 
the spectral Gromov-Hausdorff convergence and give some of its basic properties. In Section 3, 
we prove Theorem 1.4 and derive a variation of the theorem for distinguished starting points. 
Some sufficient conditions for (1.1)-(1.5) are given in Section 4. A selection of examples where 
the assumptions of Theorem 1.4 can be verified, and hence we have convergence of the mixing 
time sequence, is given in Section 5. In Section 6 we introduce some geometric conditions 
on graphs for upper and lower bounds on the mixing times for the corresponding symmetric 
Markov chains. We use these ideas to derive tail estimates of mixing times on random graphs 
in the case of the continuum random tree and the Erdős-Rényi random graph. The proofs of 
these results can be found in the Appendix. 

2 Spectral Gromov-Hausdorff convergence 

The aim of this section is to define a spectral Gromov-Hausdorff distance on triples consisting of 
a metric space, a measure and a heat kernel-type function that will allow us to make Assumption 
1 precise. We will also derive an equivalent characterization of this assumption that will be 
applied in the subsequent section when proving our mixing time convergence result, and present 
a sufficient condition for Assumption 1 that will be useful when it comes to checking it in 
examples. Note that we do not need to assume (1.3), (1.4) in this section, and only use (1.1) 
to deduce Proposition 2.4 from a result of [14]. 

First, for a compact interval $I \subset (0,\infty)$, let $\tilde{\mathcal{M}}_I$ be the collection of triples of the form 
$(F,\pi,q)$, where $F = (F,d_F)$ is a non-empty compact metric space, $\pi$ is a Borel probability 
measure on $F$ and $q = (q_t(x,y))_{x,y \in F, t \in I}$ is a jointly continuous real-valued function of $(t,x,y)$. 
We say two elements, $(F,\pi,q)$ and $(F',\pi',q')$, of $\tilde{\mathcal{M}}_I$ are equivalent if there exists an isometry 
$f : F \to F'$ such that $\pi \circ f^{-1} = \pi'$ and $q'_t \circ f = q_t$ for every $t \in I$, by which we mean 
$q'_t(f(x), f(y)) = q_t(x,y)$ for every $x,y \in F$, $t \in I$. Define $\mathcal{M}_I$ to be the set of equivalence classes 
of $\tilde{\mathcal{M}}_I$ under this relation. We will often abuse notation and identify an equivalence class in 
$\mathcal{M}_I$ with a particular element of it. Now, set 

$$\begin{aligned}
\Delta_I\left((F,\pi,q),(F',\pi',q')\right) := \inf_{Z,\phi,\phi',C} \Big\{ & d^Z_H\left(\phi(F),\phi'(F')\right) + d^Z_P\left(\pi \circ \phi^{-1}, \pi' \circ \phi'^{-1}\right) \\
& + \sup_{(x,x'),(y,y') \in C} \Big( d_Z(\phi(x),\phi'(x')) + d_Z(\phi(y),\phi'(y')) + \sup_{t \in I} \left|q_t(x,y) - q'_t(x',y')\right| \Big) \Big\},
\end{aligned}$$

where the infimum is taken over all metric spaces $Z = (Z, d_Z)$, isometric embeddings $\phi : F \to Z$, 
$\phi' : F' \to Z$, and correspondences $C$ between $F$ and $F'$, $d^Z_H$ is the Hausdorff distance between 
compact subsets of $Z$, and $d^Z_P$ is the Prohorov distance between Borel probability measures on 
$Z$. Note that, by a correspondence $C$ between $F$ and $F'$, we mean a subset of $F \times F'$ such that 
for every $x \in F$ there exists at least one $x' \in F'$ such that $(x,x') \in C$ and conversely for every 
$x' \in F'$ there exists at least one $x \in F$ such that $(x,x') \in C$. 

In the following lemma, we check that the above definition gives us a metric and that the 
corresponding space is separable. (The latter fact will be useful when it comes to making 
convergence in distribution statements regarding the mixing times of sequences of random graphs, 
as is done in Sections 5.2 and 5.3 for example.) Before this, however, let us make a few remarks 
about the inspiration for the distance in question. In the infimum characterizing $\Delta_I$, the first 
term is simply that used in the standard Gromov-Hausdorff distance (see Definition 7.3.10], 
for example). In fact, as far as the topology is concerned, this term could have been omitted 
since it is absorbed by the other terms in the expression, but we find that it is technically 
convenient and somewhat instructive to maintain it. The second term is that considered by the 
authors of [22] in defining their 'Gromov-Prohorov' distance between metric measure spaces. 
The final term is closely related to one used in [16, Section 6] when defining a distance between 
spatial trees, that is, real trees equipped with a continuous function. Indeed, the notion of a 
correspondence is quite standard in the Gromov-Hausdorff setting as a way to relate two 
compact metric spaces. One can, for example, alternatively define the Gromov-Hausdorff distance 
between compact metric spaces as half the infimum of the distortion of the correspondences 
between them (see Theorem 7.3.25]). 

Lemma 2.1. For any compact interval $I \subset (0,\infty)$, $(\mathcal{M}_I, \Delta_I)$ is a separable metric space. 

Proof. Fix a compact interval $I \subset (0,\infty)$. That $\Delta_I$ is a non-negative function and is symmetric 
is obvious. To prove that it is also the case that $\Delta_I((F,\pi,q),(F',\pi',q')) < \infty$ for any choice 
of $(F,\pi,q), (F',\pi',q') \in \mathcal{M}_I$, simply consider $Z$ to be the disjoint union of $F$ and $F'$, setting 
$d_Z(x,x') := \mathrm{diam}(F,d_F) + \mathrm{diam}(F',d_{F'})$ for any $x \in F$, $x' \in F'$, and suppose that $C = F \times F'$. 

We next show that $\Delta_I$ is positive definite. Suppose $(F,\pi,q), (F',\pi',q') \in \mathcal{M}_I$ are such that 
$\Delta_I((F,\pi,q),(F',\pi',q')) = 0$. For every $\varepsilon > 0$, we can thus choose $Z, \phi, \phi', C$ such that the sum 
of quantities in the defining infimum of $\Delta_I$ is bounded above by $\varepsilon$. Moreover, there exists a 
$\delta \in (0,\varepsilon]$ such that 

$$\sup_{\substack{x_1,x_2,y_1,y_2 \in F: \\ d_F(x_1,x_2),\, d_F(y_1,y_2) < \delta}}\ \sup_{t \in I} \left|q_t(x_1,y_1) - q_t(x_2,y_2)\right| \le \varepsilon. \tag{2.1}$$

Now, let $(x_i)_{i \ge 1}$ be a dense sequence of distinct elements of $F$ (in the case $F$ is finite, we suppose 
that the sequence terminates after having listed all of the elements of $F$). By the compactness of 
$F$, there exists an integer $N_\varepsilon$ such that $(B_F(x_i,\delta))_{i=1}^{N_\varepsilon}$ is a cover for $F$. Define $A_1 := B_F(x_1,\delta)$, 
and $A_i := B_F(x_i,\delta) \setminus \bigcup_{j=1}^{i-1} B_F(x_j,\delta)$ for $i = 2, \dots, N_\varepsilon$, so that $(A_i)_{i=1}^{N_\varepsilon}$ is a disjoint cover of $F$, 
and then consider a function $f_\varepsilon : F \to F'$ obtained by setting 

$$f_\varepsilon(x) := x'_i$$

on $A_i$, where $x'_i$ is chosen such that $(x_i, x'_i) \in C$ for each $i = 1, \dots, N_\varepsilon$. Clearly, by definition, $f_\varepsilon$ 
is a measurable function. It is further the case that it satisfies, for any $x \in F$, 

$$d_Z\left(\phi(x), \phi'(f_\varepsilon(x))\right) \le d_Z\left(\phi(x), \phi(x_i)\right) + d_Z\left(\phi(x_i), \phi'(x'_i)\right) \le 2\varepsilon,$$

where, in the above, we assume that $i \in \{1, \dots, N_\varepsilon\}$ is such that $x \in A_i$. From this, it readily 
follows that: 

$$\sup_{x,y \in F} \left|d_F(x,y) - d_{F'}(f_\varepsilon(x), f_\varepsilon(y))\right| \le 4\varepsilon \tag{2.2}$$

and 

$$d^{F'}_P\left(\pi \circ f_\varepsilon^{-1}, \pi'\right) \le 3\varepsilon, \tag{2.3}$$

where $d^{F'}_P$ is the Prohorov distance on $F'$. By applying (2.1), we also have that 

$$\sup_{x,y \in F,\, t \in I} \left|q_t(x,y) - q'_t(f_\varepsilon(x), f_\varepsilon(y))\right| \le 2\varepsilon. \tag{2.4}$$


To continue, we use a diagonalization argument to deduce the existence of a sequence $(\varepsilon_n)_{n \ge 1}$ 
such that $f_{\varepsilon_n}(x_i)$ converges to some limit $f(x_i) \in F'$ for every $i \ge 1$. From (2.2), we obtain that 
$d_{F'}(f(x_i), f(x_j)) = d_F(x_i,x_j)$ for every $i,j \ge 1$, and so we can extend the map $f$ continuously to 
the whole of $F$ (0 Proposition 1.5.9]). This construction immediately implies that $f$ is distance 
preserving. Moreover, reversing the roles of $F$ and $F'$, we are able to find a distance preserving 
map from $F'$ to $F$. Hence $f$ must be an isometry. To check that $(F,\pi,q)$ and $(F',\pi',q')$ are 
equivalent, it therefore remains to check that $\pi \circ f^{-1} = \pi'$ and $q'_t \circ f = q_t$ for every $t \in I$. Fix 
$\varepsilon > 0$ and recall that the definition of $(x_i)_{i=1}^{N_\varepsilon}$ means that it is an $\varepsilon$-net for $F$. Let $\varepsilon' \in (0,\varepsilon]$ be 
such that $d_{F'}(f_{\varepsilon'}(x_i), f(x_i)) \le \varepsilon$ for every $i = 1, \dots, N_\varepsilon$. Then, 

$$d_{F'}\left(f_{\varepsilon'}(x), f(x)\right) \le d_{F'}\left(f_{\varepsilon'}(x), f_{\varepsilon'}(x_i)\right) + d_{F'}\left(f_{\varepsilon'}(x_i), f(x_i)\right) + d_{F'}\left(f(x_i), f(x)\right) \le 7\varepsilon, \tag{2.5}$$

where we are again assuming that $i \in \{1, \dots, N_\varepsilon\}$ is such that $x \in A_i$, and have applied (2.2) 
and the distance-preserving property of $f$. In particular, this implies that 

$$d^{F'}_P\left(\pi \circ f^{-1}, \pi'\right) \le d^{F'}_P\left(\pi \circ f^{-1}, \pi \circ f_{\varepsilon'}^{-1}\right) + d^{F'}_P\left(\pi \circ f_{\varepsilon'}^{-1}, \pi'\right) \le 10\varepsilon,$$

where we use (2.3) to deduce the second inequality. Since $\varepsilon > 0$ was arbitrary, this yields that 
$\pi \circ f^{-1} = \pi'$. Finally, (2.4) and (2.5) imply that 

$$\sup_{x,y \in F,\, t \in I} \left|q_t(x,y) - q'_t(f(x),f(y))\right| \le 2\varepsilon + \sup_{\substack{x'_1,x'_2,y'_1,y'_2 \in F': \\ d_{F'}(x'_1,x'_2),\, d_{F'}(y'_1,y'_2) \le 7\varepsilon}}\ \sup_{t \in I} \left|q'_t(x'_1,y'_1) - q'_t(x'_2,y'_2)\right|,$$

and so $q'_t \circ f = q_t$ for every $t \in I$ follows from the continuity properties of $q'$. This completes the 
proof of the fact that: if $\Delta_I((F,\pi,q),(F',\pi',q')) = 0$, then the triples $(F,\pi,q)$ and $(F',\pi',q')$ 
are equivalent in the sense described at the start of the section. Consequently, $\Delta_I$ is indeed 
positive definite on the set of equivalence classes $\mathcal{M}_I$. 

For the triangle inequality, we closely follow the proof of [22, Lemma 5.2]. Let $(F^{(i)}, \pi^{(i)}, q^{(i)})$ 
be an element of $\mathcal{M}_I$, $i = 1,2,3$. Suppose that $\Delta_I((F^{(1)},\pi^{(1)},q^{(1)}),(F^{(2)},\pi^{(2)},q^{(2)})) < \delta_1$, so 
that we can find a metric space $Z_1$, isometric embeddings $\phi_{1,1} : F^{(1)} \to Z_1$ and $\phi_{2,1} : F^{(2)} \to Z_1$ 
and correspondence $C_1$ between $F^{(1)}$ and $F^{(2)}$ such that the sum of quantities in the defining 
infimum of $\Delta_I$ is bounded above by $\delta_1$. If $\Delta_I((F^{(2)},\pi^{(2)},q^{(2)}),(F^{(3)},\pi^{(3)},q^{(3)})) < \delta_2$, we define 
$Z_2, \phi_{2,2}, \phi_{3,2}, C_2$ in an analogous way. Now, set $Z$ to be the disjoint union of $Z_1$ and $Z_2$, and 
define a distance on it by setting $d_Z|_{Z_i \times Z_i} = d_{Z_i}$ for $i = 1,2$, and for $x \in Z_1$, $y \in Z_2$, 

$$d_Z(x,y) := \inf_{z \in F^{(2)}} \left( d_{Z_1}(x, \phi_{2,1}(z)) + d_{Z_2}(\phi_{2,2}(z), y) \right).$$

Abusing notation slightly, it is then the case that, after points separated by a distance 0 have 
been identified, $(Z,d_Z)$ is a metric space into which there is a natural isometric embedding $\phi_i$ 
of $Z_i$, $i = 1,2$. In this space, we have that 

$$d^Z_H\left(\phi_1(\phi_{1,1}(F^{(1)})), \phi_2(\phi_{3,2}(F^{(3)}))\right) \le d^{Z_1}_H\left(\phi_{1,1}(F^{(1)}), \phi_{2,1}(F^{(2)})\right) + d^{Z_2}_H\left(\phi_{2,2}(F^{(2)}), \phi_{3,2}(F^{(3)})\right),$$

where we have applied the fact that $\phi_1(\phi_{2,1}(y)) = \phi_2(\phi_{2,2}(y))$ for every $y \in F^{(2)}$, and so 
$\phi_1(\phi_{2,1}(F^{(2)})) = \phi_2(\phi_{2,2}(F^{(2)}))$ as subsets of $Z$. A similar bound applies to the embedded 
measures. Now, let 

$$C := \left\{(x,z) \in F^{(1)} \times F^{(3)} : \exists y \in F^{(2)} \text{ such that } (x,y) \in C_1,\ (y,z) \in C_2\right\},$$

then if $(x,z) \in C$, 

$$d_Z\left(\phi_1(\phi_{1,1}(x)), \phi_2(\phi_{3,2}(z))\right) \le d_{Z_1}\left(\phi_{1,1}(x), \phi_{2,1}(y)\right) + d_{Z_2}\left(\phi_{2,2}(y), \phi_{3,2}(z)\right),$$

where $y \in F^{(2)}$ is chosen such that $(x,y) \in C_1$ and $(y,z) \in C_2$, and we again note $\phi_1(\phi_{2,1}(y)) = 
\phi_2(\phi_{2,2}(y))$. Proceeding in the same fashion, one can deduce a corresponding bound involving 
$q^{(i)}$, $i = 1,2,3$. Putting these pieces together, it is elementary to deduce that 

$$\Delta_I\left((F^{(1)},\pi^{(1)},q^{(1)}),(F^{(3)},\pi^{(3)},q^{(3)})\right) \le \delta_1 + \delta_2,$$

and the triangle inequality follows. Thus we have proved that $(\mathcal{M}_I, \Delta_I)$ is a metric space. 
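
The composed correspondence used in this argument is easy to experiment with on finite metric spaces. The sketch below is an illustration only: the helper names `is_correspondence`, `distortion` and `compose`, and the three point sets on the real line, are assumptions made for the example. It checks that the composition of two correspondences is again a correspondence and that its distortion is at most the sum of the two distortions, which is the metric part of the bound above.

```python
def is_correspondence(C, F1, F2):
    """Every point of F1 and every point of F2 must appear in at least one pair of C."""
    return {x for x, _ in C} == set(F1) and {y for _, y in C} == set(F2)


def distortion(C, d1, d2):
    """sup over pairs (x, x'), (y, y') in C of |d1(x, y) - d2(x', y')|."""
    return max(abs(d1(x, y) - d2(xp, yp)) for (x, xp) in C for (y, yp) in C)


def compose(C1, C2):
    """{(x, z): there is y with (x, y) in C1 and (y, z) in C2}."""
    return {(x, z) for (x, y) in C1 for (y2, z) in C2 if y == y2}


def d(a, b):
    """Distance on the real line, used for all three example spaces."""
    return abs(a - b)


# Three finite subsets of the real line (illustrative values).
F1, F2, F3 = (0.0, 1.0, 2.0), (0.0, 1.1, 2.3), (0.1, 2.0)
C1 = {(0.0, 0.0), (1.0, 1.1), (2.0, 2.3)}
C2 = {(0.0, 0.1), (1.1, 0.1), (2.3, 2.0)}
C = compose(C1, C2)
print(distortion(C1, d, d), distortion(C2, d, d), distortion(C, d, d))
```

The subadditivity follows from the triangle inequality for real numbers applied to $|d_1(x,y) - d_2(x',y')| + |d_2(x',y') - d_3(x'',y'')|$; the same pattern gives the bound for the heat kernel term.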

To complete the proof, we only need to show separability. This is straightforward, however, 
as for any element of $\mathcal{M}_I$, one can construct an approximating sequence that incorporates only: 
metric spaces with a finite number of points and rational distances between them, probability 
measures on these with a rational mass at each point, and functions that are defined (at each 
coordinate pair) to be equal to rational values at a finite collection of rational time points and 
are linear between these. To be more explicit, let $(F,\pi,q)$ be an element of $\mathcal{M}_I$, and then define 
a sequence $(F^N, \pi^N, q^N)_{N \ge 1}$ as follows. First, let $F^N$ be a finite $N^{-1}$-net of $F$, which exists 
because $F$ is compact. By perturbing $d_F$, it is possible to define a metric $d_{F^N}$ on $F^N$ such that 
$|d_{F^N}(x,y) - d_F(x,y)| \le N^{-1}$ and moreover $d_{F^N}(x,y) \in \mathbb{Q}$ for all $x,y \in F^N$. Now, since $F^N$ is 
an $N^{-1}$-net of $F$, it is possible to choose a partition $(A_x)_{x \in F^N}$ of $F$ such that $x \in A_x$ and the 
diameter of $A_x$ (with respect to $d_F$) is no greater than $2N^{-1}$. Moreover, it is possible to choose 
the partition in such a way that $A_x$ is measurable for each $x \in F^N$. We construct a probability 
measure $\pi^N$ on $F^N$ by choosing $\pi^N(\{x\}) \in \mathbb{Q}$ such that $|\pi^N(\{x\}) - \pi(A_x)| \le N^{-1}$ (subject to the 
constraint that $\sum_{x \in F^N} \pi^N(\{x\}) = 1$). Finally, define $\varepsilon_N$ by setting 

$$\varepsilon_N := \sup_{s,t \in I:\ \dots}\ \sup_{x,x',y,y' \in F:\ \dots} \left|q_s(x,y) - q_t(x',y')\right|,$$

so that, by the joint continuity of $q$, $\varepsilon_N \to 0$ as $N \to \infty$. Let $\inf I \le t_0 < t_1 < \dots < t_K \le \sup I$ 
be a set of rational times such that $|t_0 - \inf I|, |\sup I - t_K|, |t_{i+1} - t_i| \le N^{-1}$, choose $q^N_{t_i}(x,y) \in \mathbb{Q}$ 
such that $|q^N_{t_i}(x,y) - q_{t_i}(x,y)| \le N^{-1}$ for each $x,y \in F^N$, and then extend $q^N$ to have domain 
$F^N \times F^N \times I$ by linear interpolation in $t$ at each pair of vertices. This construction readily yields 
that $\Delta_I((F,\pi,q),(F^N,\pi^N,q^N)) \le 6N^{-1} + 3\varepsilon_N \to 0$. Since the class of triples from which the 
approximating sequence is chosen is clearly countable, this completes the proof of separability. 
□ 



We will say that a sequence in $\mathcal{M}_I$ converges in a spectral Gromov-Hausdorff sense if it 
converges to a limit in this space with respect to the metric $\Delta_I$. We note that in the framework 
of compact Riemannian manifolds, different but related notions of spectral distances were 
introduced by Bérard, Besson and Gallot ([5]) and by Kasue and Kumura ([21]). Moreover, by 
applying our characterization of spectral Gromov-Hausdorff convergence, we are able to deduce 
that if Assumption 1 holds, then we can isometrically embed all the rescaled graphs, measures 
and transition densities upon them into a common metric space $(E, d_E)$ so that they converge to 
the relevant limit objects in a more standard way, as the following lemma makes precise. Note 
that in the proof of the result and henceforth we define balls in the space $(E, d_E)$ by setting 
$B_E(x,r) := \{y \in E : d_E(x,y) < r\}$. 

Lemma 2.2. Suppose that Assumption 1 is satisfied. For any compact interval $I \subset (0,\infty)$, 
there exist isometric embeddings of $(V(G^N), d_{G^N})$, $N \ge 1$, and $(F, d_F)$ into a common metric 
space $(E, d_E)$ such that 

$$\lim_{N \to \infty} d^E_H\left(V(G^N), F\right) = 0, \tag{2.6}$$

$$\lim_{N \to \infty} d^E_P\left(\pi^N, \pi\right) = 0, \tag{2.7}$$

and also, 

$$\lim_{N \to \infty} \sup_{x,y \in F} \sup_{t \in I} \left|q^N_{\gamma(N)t}(g_N(x), g_N(y)) - q_t(x,y)\right| = 0, \tag{2.8}$$

where, for brevity, we have identified the spaces $(V(G^N), d_{G^N})$, $N \ge 1$, and $(F, d_F)$, and the 
measures upon them with their isometric embeddings in $(E, d_E)$. For each $x \in F$, we define 
$g_N(x)$ to be a vertex in $V(G^N)$ minimizing $d_E(x,y)$ over $y \in V(G^N)$. 

Proof. Fix a compact interval $I \subset (0,\infty)$. By Assumption 1, for each $N \ge 1$ it is possible 
to find metric spaces $(E_N, d_N)$, isometric embeddings $\phi_N : (V(G^N), d_{G^N}) \to (E_N, d_N)$, 
$\phi'_N : (F, d_F) \to (E_N, d_N)$ and correspondences $C_N$ between $V(G^N)$ and $F$ such that, identifying the 
original objects and their embeddings, 

$$d^{E_N}_H\left(V(G^N), F\right) + d^{E_N}_P\left(\pi^N, \pi\right) + \sup_{(x,x'),(y,y') \in C_N} \Big( d_N(x,x') + d_N(y,y') + \sup_{t \in I} \left|q^N_{\gamma(N)t}(x,y) - q_t(x',y')\right| \Big) \le \varepsilon_N, \tag{2.9}$$

where $\varepsilon_N \to 0$. Now, proceeding similarly to the proof of the triangle inequality in Lemma 
2.1, set $E$ to be the disjoint union of $E_N$, $N \ge 1$, and define a distance on it by setting 
$d_E|_{E_N \times E_N} = d_N$ for $N \ge 1$, and for $x \in E_N$, $x' \in E_{N'}$, $N \ne N'$, set 

$$d_E(x,x') := \inf_{y \in F} \left( d_N(x,y) + d_{N'}(y,x') \right).$$

Quotienting out points that are separated by distance 0 results in a metric space $(E, d_E)$ (again, 
this is a slight abuse of notation), into which we have natural isometric embeddings of the metric 
spaces $(V(G^N), d_{G^N})$, $N \ge 1$, and $(F, d_F)$. Moreover, in the metric space $(E, d_E)$, it readily 
follows from (2.9) that the relevant isometrically embedded objects satisfy (2.6) and (2.7). To 
prove (2.8), first note that for every $x \in V(G^N)$, $N \ge 1$, there exists an $x' \in F$ such that 
$(x,x') \in C_N$. This implies that $d_E(x,x') \le \varepsilon_N$, and so, for any $\delta > 0$, 

$$\sup_{\substack{x,y,z \in V(G^N): \\ d_{G^N}(y,z) < \delta}} \sup_{t \in I} \left|q^N_{\gamma(N)t}(x,y) - q^N_{\gamma(N)t}(x,z)\right| \le 2\varepsilon_N + \sup_{\substack{x,y,z \in F: \\ d_F(y,z) < \delta + 2\varepsilon_N}} \sup_{t \in I} \left|q_t(x,y) - q_t(x,z)\right|. \tag{2.10}$$

Now, for every $x \in F$ and $N \ge 1$, there exists an $x' \in V(G^N)$ such that $(x',x) \in C_N$, and so 
$d_E(x',x) \le \varepsilon_N$. Therefore, since $g_N(x)$ is the closest vertex of $V(G^N)$ to $x$, 

$$g_N(x) \in B_E(x, 2\varepsilon_N) \cap V(G^N) \subseteq B_E(x', 3\varepsilon_N) \cap V(G^N) = B_{V(G^N)}(x', 3\varepsilon_N).$$

Consequently, 

$$\begin{aligned}
\sup_{x,y \in F} \sup_{t \in I} \left|q^N_{\gamma(N)t}(g_N(x), g_N(y)) - q_t(x,y)\right| 
&\le \varepsilon_N + 2 \sup_{\substack{x,y,z \in V(G^N): \\ d_{G^N}(y,z) < 3\varepsilon_N}} \sup_{t \in I} \left|q^N_{\gamma(N)t}(x,y) - q^N_{\gamma(N)t}(x,z)\right| \\
&\le 5\varepsilon_N + 2 \sup_{\substack{x,y,z \in F: \\ d_F(y,z) < 5\varepsilon_N}} \sup_{t \in I} \left|q_t(x,y) - q_t(x,z)\right|,
\end{aligned}$$

where the second inequality is an application of (2.10). Letting $N \to \infty$ and applying the joint 
continuity of $(q_t(x,y))_{x,y \in F, t > 0}$, we obtain the desired result. □ 



For our later convenience, let us note a useful tightness condition for the rescaled transition 
densities that was essentially established in the proof of the previous result. 



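Lemma 2.3. Suppose that Assumption 1 holds. For any compact interval $I \subset (0,\infty)$, 

$$\lim_{\delta \to 0} \limsup_{N \to \infty} \sup_{\substack{x,y,z \in V(G^N): \\ d_{G^N}(y,z) < \delta}} \sup_{t \in I} \left|q^N_{\gamma(N)t}(x,y) - q^N_{\gamma(N)t}(x,z)\right| = 0. \tag{2.11}$$

Proof. Recalling the continuity property of $q$, taking the limit as $N \to \infty$ in (2.10) yields 

$$\limsup_{N \to \infty} \sup_{\substack{x,y,z \in V(G^N): \\ d_{G^N}(y,z) < \delta}} \sup_{t \in I} \left|q^N_{\gamma(N)t}(x,y) - q^N_{\gamma(N)t}(x,z)\right| \le \sup_{\substack{x,y,z \in F: \\ d_F(y,z) \le \delta}} \sup_{t \in I} \left|q_t(x,y) - q_t(x,z)\right|.$$

Again appealing to the continuity of $q$, the right-hand side here converges to 0 as $\delta \to 0$, which 
completes the proof. □ 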
It is straightforward to reverse the conclusions of the previous two lemmas to check that if 
(2.6), (2.7), (2.8) and (2.11) hold, then so does Assumption 1. Indeed, under these assumptions, 
we have isometric embeddings of $(V(G^N), d_{G^N})$, $N \ge 1$, and $(F, d_F)$ into a common metric 
space $(E, d_E)$ for which: (2.6) gives the Hausdorff convergence of sets; (2.7) gives the Prohorov 
convergence of measures; and moreover, it is elementary to check from (2.8) and (2.11) that, 
with respect to the correspondences 

$$C^N := \left\{(x,x') \in F \times V(G^N) : d_E(x,x') < N^{-1}\right\},$$

the relevant transition densities converge uniformly, as described in the definition of the metric 
$\Delta_I$. Thus, in examples, it will suffice to check these equivalent conditions when seeking to verify 
Assumption 1. In fact, it is further possible to weaken these assumptions slightly by appealing 
to a local limit theorem from [14]. To be precise, because we are assuming that the transition 
densities of the graph satisfy the tightness condition of (2.11), we can apply [14, Theorem 15] to 
replace the local convergence statement of (2.8) with a central limit-type convergence statement. 
Note that, although in [14] it was assumed that the metric on $G^N$ was a shortest path graph 
distance, exactly the same argument yields the corresponding conclusion in our setting, and so 
we simply state the result. 

Proposition 2.4 (cf. [14, Theorem 15]). Suppose that $(V(G^N), d_{G^N})$, $N \ge 1$, and $(F, d_F)$ can 
be isometrically embedded into a common metric space $(E, d_E)$ in such a way that (2.6) and 
(2.7) are both satisfied. Moreover, assume that there exists a dense subset $F^*$ of $F$ such that, 
for any compact interval $I \subset (0,\infty)$, $x \in F^*$, $y \in F$, $r > 0$, 

$$\lim_{N \to \infty} \mathbf{P}^{G^N}_{g_N(x)}\left(X^{G^N}_{\lfloor \gamma(N)t \rfloor} \in B_E(y,r)\right) = \mathbf{P}^F_x\left(X^F_t \in B_E(y,r)\right) \tag{2.12}$$

uniformly for $t \in I$, and also (2.11) holds. Then Assumption 1 holds. 

To complete this section, let us observe that [14] also provides two ways to check (2.11): 
one involving a resistance estimate on the graphs in the sequence ([14, Proposition 17]), and 
one involving the parabolic Harnack inequality ([14, Proposition 16]). Since the first of these 
two methods will be applied in several of our examples later, let us recall the result here. 
To allow us to state the result, we define $R_{G^N}(x,y)$ to be the resistance between $x$ and $y$ in 
$V(G^N)$ (see (|Q]> ), when we suppose that $G^N$ is an electrical network with conductances of 
edges being given by the weight function $\mu^{G^N}$. This defines a metric on $V(G^N)$, for which the 
following result is proved as [14, Proposition 17]. As above, note that although it was a shortest 
path graph distance considered in [14], the same proof applies for a general distance on the 
graph in question. Moreover, the statement of the lemma is slightly different from that of the 
corresponding result in [14], because there the scaling $\alpha(N)$ was absorbed into the definition of 
the metric. 

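Lemma 2.5 (cf. [14, Proposition 17]). Suppose that there exists a sequence $(\alpha(N))_{N \ge 1}$ and 
constants $\kappa, c_1, c_2, c_3 \in (0,\infty)$ such that 

$$R_{G^N}(x,y) \le c_1 \left(\alpha(N)\, d_{G^N}(x,y)\right)^{\kappa}, \qquad \forall x,y \in V(G^N),$$

and also 

$$c_2 \gamma(N) \le \alpha(N)^{\kappa} \beta(N) \le c_3 \gamma(N),$$

where $\beta(N) := \sum_{x,y \in V(G^N)} \mu^{G^N}_{xy}$, then (2.11) holds. 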
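
To make the resistance condition concrete, the effective resistance of a small network can be computed by solving the Laplacian system with one vertex grounded. The following is a sketch under stated assumptions: the network is encoded as a symmetric dict-of-dicts `mu` of conductances, and `effective_resistance` is a hypothetical helper using exact rational Gauss-Jordan elimination; it is not the method of [14].

```python
from fractions import Fraction


def effective_resistance(mu, a, b):
    """Effective resistance between a and b in the connected network with conductances mu[x][y]."""
    if a == b:
        return Fraction(0)
    nodes = [x for x in mu if x != b]  # ground b, so its potential is 0
    idx = {x: i for i, x in enumerate(nodes)}
    n = len(nodes)
    # Augmented matrix of the reduced Laplacian system L v = e_a (unit current injected at a).
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for x in nodes:
        i = idx[x]
        for y, w in mu[x].items():
            A[i][i] += Fraction(w)
            if y != b:
                A[i][idx[y]] -= Fraction(w)
    A[idx[a]][n] = Fraction(1)
    # Gauss-Jordan elimination; the reduced Laplacian of a connected graph is invertible.
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return A[idx[a]][n]  # potential at a equals the effective resistance


# Unit conductances on a path with 3 vertices and on a 4-cycle (illustrative networks).
path = {0: {1: 1}, 1: {0: 1, 2: 1}, 2: {1: 1}}
cycle = {i: {(i - 1) % 4: 1, (i + 1) % 4: 1} for i in range(4)}
print(effective_resistance(path, 0, 2), effective_resistance(cycle, 0, 1))
```

With unit conductances the resistance never exceeds the shortest path graph distance, so in such cases a bound of the first type in Lemma 2.5 with $\kappa = 1$ reduces to comparing $d_{G^N}$ with the graph distance scaled by $\alpha(N)$.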
3 Convergence of $L^p$-mixing times 

3.1 Proof of Theorem 1.4 

In this subsection we prove the mixing time convergence result of Theorem 1.4. Throughout, 
we will suppose that Assumption 1 holds and that the graphs $G^N$ and limiting metric space $F$ 
have been embedded into a common metric space $(E, d_E)$ in the way described by Lemma 2.2. 
Recall from the introduction the definition of $D_p(x,t) = \|q_t(x,\cdot) - 1\|_{L^p(\pi)}$, the $L^p$-distance 
from stationarity of the process $X^F$ started from $x$ at time $t$. By applying the continuity of 
$(q_t(x,y))_{x,y \in F, t > 0}$, compactness of $F$ and finiteness of $\pi$, it is easy to check that this quantity 
is finite for every $x \in F$ and $t > 0$. The next lemma collects together a number of other 
basic properties of $D_p(x,t)$ that we will apply later (the first part is a minor extension of [HI 
Proposition 3.1], in our setting). 

Lemma 3.1. Let p 6 [1, oo]. For every x £ F, the function t i— > D p (x,t) is continuous and 
strictly decreasing. Furthermore, we have 

lim D p (x,t) > 2. (3.1) 



Proof. That the function $t \mapsto D_p(x,t)$ is continuous is clear from (1.2), and so we turn to checking that it is strictly decreasing. First, a standard argument involving an application of Jensen's inequality and the invariance of $\pi$ allows one to deduce that $\|P_t f\|_{L^p(\pi)} \leq \|f\|_{L^p(\pi)}$ for any $f \in L^p(F,\pi)$, where $(P_t)_{t \geq 0}$ is the semigroup naturally associated with the transition density $(q_t(x,y))_{x,y \in F, t>0}$. Now, suppose $f \in L^p(F,\pi)$ is such that $\|P_t f\|_{L^p(\pi)} = \|f\|_{L^p(\pi)}$, and define $f_1(y) := |P_t f(y)|^p$ and $f_2(y) := P_t(|f|^p)(y)$. By the assumption on $f$ and the fact that $X^F$ is conservative and $\pi$-symmetric, we have that

$$\begin{aligned}
\int_F f_1 \, d\pi &= \int_F |P_t f(y)|^p \, \pi(dy) \\
&= \int_F |f(y)|^p \, \pi(dy) \\
&= \int_F |f(y)|^p \int_F q_t(y,z) \, \pi(dz) \, \pi(dy) \\
&= \int_F \int_F |f(y)|^p q_t(z,y) \, \pi(dy) \, \pi(dz) \\
&= \int_F P_t(|f|^p)(z) \, \pi(dz) \\
&= \int_F f_2 \, d\pi.
\end{aligned}$$

Furthermore, Jensen's inequality implies $f_1(y) \leq f_2(y)$. Thus, it must be the case that $f_1(y) = f_2(y)$, $\pi$-a.e. In particular, because $\pi$ is a probability measure, there exists a $y \in F$ such that $f_1(y) = f_2(y)$.

In the case $p > 1$, the conclusion of the previous paragraph readily implies that $f$ is constant $q_t(y,z)\pi(dz)$-a.e. Recalling the assumption that $q_t(y,z) > 0$ everywhere, namely (1.3), it must therefore hold that $f$ is constant $\pi$-a.e. Observing that for $s, t > 0$ we can write $D_p(x, s+t) = \|P_s(q_t(x,\cdot) - 1)\|_{L^p(\pi)}$, and that a constant function with zero $\pi$-integral must vanish, it follows that $D_p(x, s+t) = D_p(x,t)$ if and only if $q_t(x,\cdot) = 1$, $\pi$-a.e. However, condition (1.4) and the assumption that the transition density is continuous imply that there exists a non-empty open set on which $q_t(x,\cdot) \neq 1$. Thus, because $\pi$ has full support, it is not the case that $q_t(x,\cdot) = 1$, $\pi$-a.e., and we must have $D_p(x, s+t) < D_p(x,t)$, as desired.

For $p = 1$, the result $f_1(y) = f_2(y)$ implies that $f$ is either non-negative or non-positive, $\pi$-a.e. Consequently, if we suppose that $D_p(x, s+t) = \|P_s(q_t(x,\cdot) - 1)\|_{L^p(\pi)} = D_p(x,t)$ for some $s > 0$, then it must be the case that $q_t(x,\cdot) - 1$ is either non-negative or non-positive. However, since $\int_F (q_t(x,y) - 1)\pi(dy) = 0$ (due to (1.1)) and (1.3) holds, we arrive at a contradiction. In particular, it must be the case that $D_p(x, s+t) < D_p(x,t)$, and this completes the proof of strict monotonicity.

To establish the limit in (3.1), it will suffice to prove the result in the case $p = 1$ (obtaining the result for other values of $p$ is then simply Jensen's inequality). Let $x \in F$ and $r > 0$, then

$$\begin{aligned}
D_1(x,t) &\geq \int_{B_E(x,r)} (q_t(x,y) - 1) \, \pi(dy) + \int_{B_E(x,r)^c} (1 - q_t(x,y)) \, \pi(dy) \\
&= 2 P_x\left(X^F_t \in B_E(x,r)\right) - 2\pi(B_E(x,r)),
\end{aligned}$$

where (1.1) is used in the last equality. Since $X^F$ is a Hunt process, the first term here converges to 2 as $t \to 0$. Furthermore, because $\pi$ is non-atomic, the second term can be made arbitrarily small by suitable choice of $r$. The result follows. □



We continue by defining the $L^p$-mixing time at $x \in F$ by setting

$$t^p_{\mathrm{mix}}(x) := \inf\{t > 0 : D_p(x,t) \leq 1/4\}.$$

In fact, the previous lemma yields that $t^p_{\mathrm{mix}}(x)$ is the unique value of $t \in (0,\infty)$ such that $D_p(x,t) = 1/4$ (when (1.5) holds at $x$). Similarly, define the $L^p$-mixing time of $x \in V(G^N)$ by setting

$$t^p_{\mathrm{mix}}(x) := \inf\{m > 0 : D^N_p(x,m) \leq 1/4\},$$

where $D^N_p(x,m) = \|q^N_m(x,\cdot) - 1\|_{L^p(\pi^N)}$. That the discrete mixing times at a point converge when suitably rescaled to the continuous mixing time there is the conclusion of the following proposition.

Proposition 3.2. Suppose that Assumption 1 is satisfied. If $p \in [1, \infty]$ is such that (1.5) holds for $x \in F$, then

$$\lim_{N \to \infty} \gamma(N)^{-1} t^p_{\mathrm{mix}}(g_N(x)) = t^p_{\mathrm{mix}}(x),$$

where, as in the statement of Lemma 2.2, $g_N(x)$ is a vertex in $V(G^N)$ that minimizes the distance $d_E(x,y)$ over $V(G^N)$.

Proof. Suppose $p \in [1, \infty]$ is such that (1.5) holds for $x \in F$, set $t_0 := t^p_{\mathrm{mix}}(x) \in (0, \infty)$ and fix $\varepsilon > 0$. By (1.2) and the tightness of Lemma 2.3, there exists a $\delta > 0$ such that

$$\sup_{t \in I} \sup_{\substack{y,z \in F: \\ d_E(y,z) < 2\delta}} \Big| |q_t(x,y) - 1|^p - |q_t(x,z) - 1|^p \Big| < \varepsilon, \tag{3.2}$$

$$\limsup_{N \to \infty} \sup_{t \in I} \sup_{\substack{y,z \in V(G^N): \\ d_{G^N}(y,z) < 3\delta}} \Big| |q^N_{\gamma(N)t}(g_N(x),y) - 1|^p - |q^N_{\gamma(N)t}(g_N(x),z) - 1|^p \Big| < \varepsilon, \tag{3.3}$$

where $I := [t_0/2, 2t_0]$. Moreover, by the compactness of $F$, there exists a finite collection of balls $(B_E(x_i,\delta))_{i=1}^k$ covering $F$. Define $A_1 := B_E(x_1, 2\delta)$, and $A_i := B_E(x_i, 2\delta) \setminus \bigcup_{j=1}^{i-1} B_E(x_j, 2\delta)$ for $i = 2, \dots, k$, so that $(A_i)_{i=1}^k$ is a disjoint cover of the $\delta$-enlargement of $F$. We observe

$$\left| D_p(x,t)^p - D^N_p(g_N(x), \gamma(N)t)^p \right| \leq T_1 + T_2 + T_3 + T_4,$$
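where

$$\begin{aligned}
T_1 &:= \left| \int_F |q_t(x,y) - 1|^p \, \pi(dy) - \sum_{i=1}^k |q_t(x,x_i) - 1|^p \pi(A_i) \right|, \\
T_2 &:= \left| \sum_{i=1}^k |q_t(x,x_i) - 1|^p \pi(A_i) - \sum_{i=1}^k |q_t(x,x_i) - 1|^p \pi^N(A_i) \right|, \\
T_3 &:= \left| \sum_{i=1}^k |q_t(x,x_i) - 1|^p \pi^N(A_i) - \sum_{i=1}^k |q^N_{\gamma(N)t}(g_N(x), g_N(x_i)) - 1|^p \pi^N(A_i) \right|, \\
T_4 &:= \left| \sum_{i=1}^k |q^N_{\gamma(N)t}(g_N(x), g_N(x_i)) - 1|^p \pi^N(A_i) - \sum_{y \in V(G^N)} |q^N_{\gamma(N)t}(g_N(x), y) - 1|^p \pi^N(\{y\}) \right|.
\end{aligned}$$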

Now, suppose $t \in I$. From (3.2), we immediately deduce that $T_1 \leq \varepsilon$. For $T_2$, we first observe that the fact balls are $\pi$-continuity sets implies that $A_1, \dots, A_k$ are also $\pi$-continuity sets. Hence $\pi^N(A_i) \to \pi(A_i)$ for each $i = 1, \dots, k$, and so $T_2 \leq \varepsilon$ for large $N$. That $T_3 \leq \varepsilon$ for large $N$ is a straightforward consequence of Lemma 2.2. Finally, applying the fact that $d_H^E(F, V(G^N)) \to 0$, we deduce that, for large $N$, $(A_i)_{i=1}^k$ is a disjoint cover for $V(G^N)$. Since $g_N(x_i) \in B_E(x_i, \delta)$ for large $N$, we also have that $d_{G^N}(y, g_N(x_i)) \leq 3\delta$, uniformly over $y \in A_i$, $i = 1, \dots, k$. Thus we can appeal to (3.3) to deduce that it is also the case that $T_4 \leq \varepsilon$ for large $N$. In fact, each of these bounds can be assumed to hold uniformly over $t \in I$, thereby demonstrating that

$$\lim_{N \to \infty} \sup_{t \in I} \left| D_p(x,t) - D^N_p(g_N(x), \gamma(N)t) \right| = 0. \tag{3.4}$$

Since $t \mapsto D^N_p(g_N(x), \gamma(N)t)$ is a decreasing function in $t$ for every $N$ (cf. [SJ Proposition 3.1]) and $t \mapsto D_p(x,t)$ is strictly decreasing, the proposition follows. □



Remark 3.3. In the case $p = 2$, the proof of the previous result greatly simplifies. In particular, we note that

$$D_2(x,t)^2 = \|q_t(x,\cdot) - 1\|^2_{L^2(\pi)} = q_{2t}(x,x) - 1, \tag{3.5}$$

and similarly

$$D^N_2(x, \gamma(N)t)^2 = \|q^N_{\gamma(N)t}(x,\cdot) - 1\|^2_{L^2(\pi^N)} = q^N_{2\gamma(N)t}(x,x) - 1.$$

Hence the limit at (3.4) is an immediate consequence of the local limit result of (2.8), and we do not have to concern ourselves with estimating the relevant integrals directly.
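
The discrete identity in Remark 3.3 can be checked numerically. The following is a minimal sketch, assuming a small weighted graph chosen purely for illustration (the matrix `WEIGHTS` and all function names are hypothetical and not taken from the paper); it uses the convention $q^N_m(x,y) = P^m(x,y)/\pi^N(y)$ for the transition density of the walk with respect to its invariant probability measure.

```python
# Sketch: check D_2(x, m)^2 = q_{2m}(x, x) - 1 for a reversible random walk.
# The weighted graph below is a hypothetical example, not one from the paper.

def transition_matrix(weights):
    """P(x, y) = mu_xy / mu_x and invariant probability pi(x) = mu_x / sum_z mu_z."""
    n = len(weights)
    mu = [sum(row) for row in weights]
    total = sum(mu)
    P = [[weights[x][y] / mu[x] for y in range(n)] for x in range(n)]
    pi = [mu[x] / total for x in range(n)]
    return P, pi

def law_after(x, m, P):
    """Distribution of the walk started from x after m steps."""
    n = len(P)
    law = [1.0 if y == x else 0.0 for y in range(n)]
    for _ in range(m):
        law = [sum(law[z] * P[z][y] for z in range(n)) for y in range(n)]
    return law

# Symmetric conductances on 4 vertices: a triangle 0-1-2 with a pendant vertex 3.
WEIGHTS = [
    [0.0, 1.0, 2.0, 0.0],
    [1.0, 0.0, 1.5, 0.0],
    [2.0, 1.5, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.0],
]
P, pi = transition_matrix(WEIGHTS)

def d2_squared(x, m):
    """Left-hand side: sum_y (q_m(x, y) - 1)^2 pi(y)."""
    law = law_after(x, m, P)
    return sum((law[y] / pi[y] - 1.0) ** 2 * pi[y] for y in range(len(pi)))

def on_diagonal(x, m):
    """Right-hand side: q_{2m}(x, x) - 1."""
    return law_after(x, 2 * m, P)[x] / pi[x] - 1.0

# Largest discrepancy over all starting points and the first few times.
gaps = [abs(d2_squared(x, m) - on_diagonal(x, m))
        for x in range(len(pi)) for m in range(1, 6)]
print(max(gaps))
```

The two sides agree up to floating-point error because the walk is reversible with respect to $\pi^N$, which is exactly the symmetry used together with the Chapman-Kolmogorov equations in the continuous identity (3.5).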

To extend the above proposition to the corresponding result for the mixing times of the entire spaces, we will appeal to the following lemma, which establishes a continuity property for the $L^p$-mixing times from fixed starting points in the limiting space, and a related tightness property for the discrete approximations.

Lemma 3.4. Suppose $p \in [1, \infty]$ is such that (1.5) holds for $x \in F$, then the following statements are true.

(a) The function $y \mapsto t^p_{\mathrm{mix}}(y)$ is continuous at $x$.

(b) Under Assumption 1, it is the case that

$$\lim_{\delta \to 0} \limsup_{N \to \infty} \sup_{\substack{y \in V(G^N): \\ d_{G^N}(g_N(x),y) < \delta}} \gamma(N)^{-1} \left| t^p_{\mathrm{mix}}(g_N(x)) - t^p_{\mathrm{mix}}(y) \right| = 0.$$
Proof. Consider $p \in [1, \infty]$ such that (1.5) holds for $x \in F$, so that $t_0 := t^p_{\mathrm{mix}}(x)$ is finite, and let $\varepsilon \in (0, t_0/2)$. Since the function $t \mapsto D_p(x,t)$ is strictly decreasing (by Lemma 3.1), there exists an $\eta > 0$ such that $D_p(x, t_0 - \varepsilon) > D_p(x, t_0) + \eta = 1/4 + \eta$ and also $D_p(x, t_0 + \varepsilon) < 1/4 - \eta$. By the continuity of $(q_t(x,y))_{x,y \in F, t>0}$, there also exists a $\delta > 0$ such that

$$\sup_{t \in [t_0 - \varepsilon, t_0 + \varepsilon]} \sup_{\substack{y \in F: \\ d_F(x,y) < \delta}} |D_p(x,t) - D_p(y,t)| < \eta.$$

Hence if $y \in B_F(x,\delta)$, then

$$D_p(y, t_0 - \varepsilon) > D_p(x, t_0 - \varepsilon) - \eta > \frac{1}{4},$$

$$D_p(y, t_0 + \varepsilon) < D_p(x, t_0 + \varepsilon) + \eta < \frac{1}{4}.$$

This implies that $t^p_{\mathrm{mix}}(y) \in [t_0 - \varepsilon, t_0 + \varepsilon]$, and (a) follows.

The proof of part (b) is similar. In particular, choose $\eta$ as above and note that (3.4) implies that $D^N_p(g_N(x), \gamma(N)(t_0 - \varepsilon)) > 1/4 + \eta/2$ and $D^N_p(g_N(x), \gamma(N)(t_0 + \varepsilon)) < 1/4 - \eta/2$ for large $N$. Furthermore, by the transition density tightness of Lemma 2.3, there exists a $\delta > 0$ such that

$$\sup_{t \in [t_0 - \varepsilon, t_0 + \varepsilon]} \sup_{\substack{y \in V(G^N): \\ d_{G^N}(g_N(x),y) < \delta}} \left| D^N_p(g_N(x), \gamma(N)t) - D^N_p(y, \gamma(N)t) \right| < \frac{\eta}{2},$$

for large $N$. Hence if $N$ is large and $y \in V(G^N)$ is such that $d_{G^N}(g_N(x), y) < \delta$, then $D^N_p(y, \gamma(N)(t_0 - \varepsilon)) > 1/4$ and $D^N_p(y, \gamma(N)(t_0 + \varepsilon)) < 1/4$. This implies that $\gamma(N)^{-1} t^p_{\mathrm{mix}}(y) \in [t_0 - \varepsilon, t_0 + \varepsilon]$. Since it is trivially true that, once $N$ is large enough, this result can be applied with $y = g_N(x)$, the result follows. □



We are now ready to give the proof of our main result. 

Proof of Theorem 1.4. Observe that, under the assumptions of the theorem, Lemma 3.4(a) implies that the function $(t^p_{\mathrm{mix}}(x))_{x \in F}$ is continuous. Since $F$ is compact, the supremum of $(t^p_{\mathrm{mix}}(x))_{x \in F}$ is therefore finite. Now, it is an elementary exercise to check that we can write the $L^p$-mixing time of $F$, as defined at (1.6), in the following way:

$$t^p_{\mathrm{mix}}(F) = \sup_{x \in F} t^p_{\mathrm{mix}}(x). \tag{3.6}$$

Consequently $t^p_{\mathrm{mix}}(F) \in (0, \infty)$, as desired.

To complete the proof, we are required to demonstrate the convergence statement of (1.10). Fix $\varepsilon > 0$. For every $x \in F$, Proposition 3.2 and Lemma 3.4(b) allow us to choose $\delta(x) > 0$ and $N(x) < \infty$ such that

$$\sup_{N \geq N(x)} \left| \gamma(N)^{-1} t^p_{\mathrm{mix}}(g_N(x)) - t^p_{\mathrm{mix}}(x) \right| < \varepsilon,$$

$$\sup_{N \geq N(x)} \sup_{\substack{y \in V(G^N): \\ d_{G^N}(g_N(x),y) < 4\delta(x)}} \gamma(N)^{-1} \left| t^p_{\mathrm{mix}}(g_N(x)) - t^p_{\mathrm{mix}}(y) \right| < \varepsilon.$$
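Since $(B_E(x, \delta(x)))_{x \in F}$ is an open cover for $F$, by compactness it admits a finite subcover $(B_E(x, \delta(x)))_{x \in \mathcal{X}}$. Moreover, because $d_H^E(F, V(G^N)) \to 0$, there exists an $N_0 > 0$ such that if $N \geq N_0$, then $(B_E(x, 2\delta(x)))_{x \in \mathcal{X}}$ is a cover for $V(G^N)$. Applying this choice of $\mathcal{X}$, we have for $N \geq N_0 \vee \max_{x \in \mathcal{X}} N(x)$ that

$$\gamma(N)^{-1} t^p_{\mathrm{mix}}(G^N) \leq \sup_{x \in \mathcal{X}} \gamma(N)^{-1} t^p_{\mathrm{mix}}(g_N(x)) + \varepsilon \leq \sup_{x \in \mathcal{X}} t^p_{\mathrm{mix}}(x) + 2\varepsilon \leq t^p_{\mathrm{mix}}(F) + 2\varepsilon,$$

where we note that, similarly to (3.6), the $L^p$-mixing time of the graph $G^N$ can be written as

$$t^p_{\mathrm{mix}}(G^N) = \sup_{x \in V(G^N)} t^p_{\mathrm{mix}}(x).$$

Furthermore, if $x_0 \in F$ is chosen such that $t^p_{\mathrm{mix}}(x_0) \geq t^p_{\mathrm{mix}}(F) - \varepsilon$, then, for large $N$,

$$\gamma(N)^{-1} t^p_{\mathrm{mix}}(G^N) \geq \gamma(N)^{-1} t^p_{\mathrm{mix}}(g_N(x_0)) \geq t^p_{\mathrm{mix}}(x_0) - \varepsilon \geq t^p_{\mathrm{mix}}(F) - 2\varepsilon,$$

where we have again made use of Proposition 3.2. Since $\varepsilon > 0$ was arbitrary, we are done. □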



3.2 Distinguished starting points 

In certain situations, convergence of transition densities might only be known with respect to a single distinguished starting point. This is the case, for instance, in two of the most important examples we present in Section 5: critical Galton-Watson trees and the critical Erdős-Rényi random graph. In such settings, it is only possible to prove a convergence result for the mixing time from the distinguished point. It is the purpose of this subsection to present a precise conclusion of this kind.

Consider, for a compact interval $I \subset (0, \infty)$, the space of triples of the form $(F, \pi, q)$, where $F = (F, d_F, \rho)$ is a non-empty compact metric space with distinguished vertex $\rho$, $\pi$ is a Borel probability measure on $F$ and $q = (q_t(x,y))_{x,y \in F, t \in I}$ is a jointly continuous real-valued function of $(t,x,y)$; this is the same as the collection $\mathcal{M}_I$ defined in Section 2, though we have added the supposition that the metric spaces are pointed. We say two such elements, $(F, \pi, q)$ and $(F', \pi', q')$, are equivalent if there exists an isometry $f : F \to F'$ such that $f(\rho) = \rho'$, $\pi \circ f^{-1} = \pi'$ and $q'_t \circ f = q_t$ for every $t \in I$. By following the proof of Lemma 2.1, one can check that it is possible to define a metric on the equivalence classes of this relation by simply including in the definition of $\Delta_I$ the condition that the correspondence $\mathcal{C}$ must contain $(\rho, \rho')$. We define convergence in a spectral pointed Gromov-Hausdorff sense to be with respect to this metric. The distinguished starting point version of Assumption 1 is then as follows.

Assumption 2. Let $(G^N)_{N \geq 1}$ be a sequence of finite connected graphs with at least two vertices and one, $\rho^N$ say, distinguished, for which there exists a sequence $(\gamma(N))_{N \geq 1}$ such that, for any compact interval $I \subset (0, \infty)$,

$$\left( \left(V(G^N), d_{G^N}, \rho^N\right), \pi^N, \left(q^N_{\gamma(N)t}(\rho^N, x)\right)_{x \in V(G^N), t \in I} \right)$$

converges to $((F, d_F, \rho), \pi, (q_t(\rho, x))_{x \in F, t \in I})$ in a spectral pointed Gromov-Hausdorff sense, where $\rho$ is a distinguished point in $F$.

The following result can then be proved in an almost identical fashion to Proposition 3.2, simply replacing $g_N(x)$ by $\rho^N$ and $x$ by $\rho$. In doing this, it is useful to note that if Assumption 1 is replaced by Assumption 2, then we are able to include in the conclusions of Lemma 2.2 that $\rho^N$ converges to $\rho$ in $E$.

Theorem 3.5. Suppose that Assumption 2 is satisfied. If $p \in [1, \infty]$ is such that (1.5) holds for $x = \rho$, then

$$\gamma(N)^{-1} t^p_{\mathrm{mix}}(\rho^N) \to t^p_{\mathrm{mix}}(\rho).$$
4 Convergence to stationarity of the transition density



Before continuing to present example applications of the mixing time convergence results proved so far, we describe how to check the $L^p$ convergence to stationarity of the transition density of $X^F$ in the case when we have a spectral decomposition for it and a spectral gap. In the same setting, we will also explain how to check the non-triviality conditions on the transition density that were made in the introduction.

Write the generator of the conservative Hunt process $X^F$ as $-\Delta$, and suppose that $\Delta$ has a compact resolvent. Then there exists a complete orthonormal basis of $L^2(F,\pi)$, $(\varphi_k)_{k \geq 0}$ say, such that $\Delta \varphi_k = \lambda_k \varphi_k$ for all $k \geq 0$, $0 \leq \lambda_0 \leq \lambda_1 \leq \dots$ and $\lim_{k \to \infty} \lambda_k = \infty$. By expanding as a Fourier series, we can consequently write the transition density of $X^F$ as

$$\begin{aligned}
q_t(x,y) &= \sum_{k \geq 0} \left( \int_F q_t(x,z) \varphi_k(z) \, \pi(dz) \right) \varphi_k(y) \\
&= \sum_{k \geq 0} P^F_t \varphi_k(x) \varphi_k(y) \\
&= \sum_{k \geq 0} e^{-\lambda_k t} \varphi_k(x) \varphi_k(y),
\end{aligned}$$

where $(P^F_t)_{t \geq 0}$ is the associated semigroup, and the final equality holds as a simple consequence of the fact that $\frac{d}{dt}(P^F_t \varphi_k) = -P^F_t \Delta \varphi_k = -\lambda_k P^F_t \varphi_k$. Now by (1.1), it holds that $1 = P^F_t 1$ is in the domain of $\Delta$. A standard argument thus yields $\Delta 1 = \Delta P^F_t 1 = -\frac{d}{dt}(P^F_t 1) = 0$, and so there is no loss of generality in presupposing that $\lambda_0 = 0$ and $\varphi_0 \equiv 1$ in this setting. The only additional assumption we make on the transition density $(q_t(x,y))_{x,y \in F, t>0}$ is that it is jointly continuous in $(t,x,y)$ (i.e. (1.2) holds).

Lemma 4.1. Suppose that the operator $\Delta$ has a compact resolvent, so that the above spectral decomposition holds. If there is a spectral gap, i.e. $\lambda_1 > 0$, then $(q_t(x,y))_{x,y \in F, t>0}$ converges to stationarity in an $L^p$ sense (namely (1.5) holds) for any $p \in [1, \infty]$.

Proof. Recall from (3.5) that $D_2(x,t)^2 = q_{2t}(x,x) - 1$. Under the assumptions of the lemma, it follows that

$$D_2(x,t)^2 = \sum_{k \geq 1} e^{-2\lambda_k t} \varphi_k(x)^2 \to 0, \tag{4.1}$$

as $t \to \infty$, which completes the proof of the result for $p = 2$. To extend this to any $p$, we first use Cauchy-Schwarz to deduce

$$\begin{aligned}
(q_t(x,y) - 1)^2 &= \left( \sum_{k \geq 1} e^{-\lambda_k t} \varphi_k(x) \varphi_k(y) \right)^2 \\
&\leq \sum_{k \geq 1} e^{-\lambda_k t} \varphi_k(x)^2 \sum_{k \geq 1} e^{-\lambda_k t} \varphi_k(y)^2 \\
&= (q_t(x,x) - 1)(q_t(y,y) - 1).
\end{aligned}$$

Consequently, we have that

$$\begin{aligned}
D_\infty(x,t)^2 &= \sup_{y \in F} (q_t(x,y) - 1)^2 \\
&\leq (q_t(x,x) - 1) \sup_{y \in F} (q_t(y,y) - 1) \\
&\leq D_2(x, t/2)^2 \sup_{y \in F} D_\infty(y, 1)
\end{aligned}$$

for any $t \geq 1$, where the second inequality involves an application of the monotonicity property proved as part of Lemma 3.1. Now, by (1.2), the term $\sup_{y \in F} D_\infty(y,1)$ is a finite constant, and so combining the above bound with (4.1) implies that $D_\infty(x,t) \leq C D_2(x, t/2) \to 0$ as $t \to \infty$. The result for general $p \in [1, \infty]$ is an immediate consequence of this. □
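
As an illustration of the Cauchy-Schwarz step in the proof above, the following minimal sketch checks the discrete analogue $(q_m(x,y) - 1)^2 \leq (q_m(x,x) - 1)(q_m(y,y) - 1)$ numerically. The graph `EDGES` and the function names are hypothetical choices made for this example. The walk is taken to be lazy (holding probability 1/2), an assumption made here so that all eigenvalues of the transition operator are non-negative and play the role of the positive factors $e^{-\lambda_k t}$.

```python
# Sketch: numerical check of (q_m(x,y) - 1)^2 <= (q_m(x,x) - 1)(q_m(y,y) - 1).
# Assumption: a lazy walk (holding probability 1/2) on a hypothetical graph, so that
# the eigenvalues of the transition operator are non-negative.

EDGES = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
N_VERTICES = 6

def lazy_walk(edges, n):
    """Transition matrix of the lazy simple random walk and its invariant law."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    P = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        P[a][b] += 0.5 / deg[a]
        P[b][a] += 0.5 / deg[b]
    for x in range(n):
        P[x][x] += 0.5
    pi = [d / (2.0 * len(edges)) for d in deg]
    return P, pi

def density(P, pi, m):
    """q_m(x, y) = P^m(x, y) / pi(y) for all pairs (x, y)."""
    n = len(P)
    power = [[1.0 if x == y else 0.0 for y in range(n)] for x in range(n)]
    for _ in range(m):
        power = [[sum(power[x][z] * P[z][y] for z in range(n)) for y in range(n)]
                 for x in range(n)]
    return [[power[x][y] / pi[y] for y in range(n)] for x in range(n)]

P, pi = lazy_walk(EDGES, N_VERTICES)

def worst_violation(m):
    """Maximum of LHS - RHS over all pairs; it should not be positive (up to rounding)."""
    q = density(P, pi, m)
    n = len(pi)
    return max((q[x][y] - 1.0) ** 2 - (q[x][x] - 1.0) * (q[y][y] - 1.0)
               for x in range(n) for y in range(n))

violations = [worst_violation(m) for m in range(1, 11)]
print(max(violations))
```

For a non-lazy walk the same inequality can fail at odd times, since negative eigenvalues break the factorisation used in the Cauchy-Schwarz step; restricting to even times would be an alternative to laziness.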

We now give a lemma that explains how to check conditions (1.3) and (1.4).

Lemma 4.2. Suppose that the operator $\Delta$ has a compact resolvent and there is a spectral gap, then the conditions (1.3) and (1.4) are automatically satisfied.

Proof. Firstly, assume that $q_t(x,y) = 0$ for some $x, y \in F$, $t > 0$. If $s \in (0,t)$, then the Chapman-Kolmogorov equations yield $0 = q_t(x,y) = \int_F q_s(x,z) q_{t-s}(z,y) \pi(dz)$. Since $\pi$ has full support, using (1.2), it follows that $q_s(x,z) q_{t-s}(z,y) = 0$ for every $z \in F$. In particular, $q_s(x,y) q_{t-s}(y,y) = 0$. Noting that $q_{t-s}(y,y) = D_2(y, (t-s)/2)^2 + 1 \geq 1$, we deduce that $q_s(x,y) = 0$. Now, define a function $f : (0, \infty) \to \mathbb{R}_+$ by setting $f(s) := q_s(x,y)$. Letting $(\tilde{\lambda}_i)_{i \geq 0}$ represent the distinct eigenvalues of $\Delta$, we can write

$$f(s) = \sum_{i \geq 0} a_i e^{-\tilde{\lambda}_i s},$$

where $a_i := \sum_{j : \lambda_j = \tilde{\lambda}_i} \varphi_j(x) \varphi_j(y)$. In fact, since Cauchy-Schwarz implies $\sum_{i \geq 0} |a_i e^{-\tilde{\lambda}_i s}| \leq (q_s(x,x) q_s(y,y))^{1/2} < \infty$, this series converges absolutely whenever $s \in (0, \infty)$. Thus $f(z) := \sum_{i \geq 0} a_i e^{-\tilde{\lambda}_i z}$ defines an analytic function on the whole half-plane $\Re(z) > 0$. By our previous observation regarding $q_s(x,y)$, this analytic function is equal to 0 on the set $(0, t]$, and therefore it must be 0 everywhere on $\Re(z) > 0$. However, this contradicts the fact that $f(t) = q_t(x,y) \to 1$ as $t \to \infty$, which was proved in Lemma 4.1. Hence, $q_t(x,y) > 0$ for every $x, y \in F$, $t > 0$.

Secondly, suppose that $q_t(x,\cdot) \equiv 1$ for some $x \in F$ and $t > 0$. Then $1 = q_t(x,x) = 1 + \sum_{i \geq 1} \varphi_i(x)^2 e^{-\lambda_i t}$, and so $\varphi_i(x) = 0$ for every $i \geq 1$. This implies that $q_t(x,x) = 1$ for every $t > 0$. However, by following the proof of (3.1), one can deduce that

$$\lim_{t \to 0} (q_t(x,x) - 1) = \lim_{t \to 0} D_2(x, t/2)^2 \geq \lim_{t \to 0} D_1(x, t/2)^2 \geq 2,$$

and so the previous conclusion cannot hold. Consequently, we have shown that $q_t(x,\cdot) \not\equiv 1$ for any $x \in F$, $t > 0$, as desired. □

To summarize, the above results demonstrate that to verify all the conditions on the transition density that are required to apply our mixing time convergence results, it will suffice to check that the conservative Hunt process $X^F$ has a jointly continuous transition density and the corresponding non-negative self-adjoint operator, $\Delta$, has a compact resolvent and exhibits a spectral gap. As the following corollary explains, this is a particularly useful observation in the case that the Dirichlet form $(\mathcal{E}, \mathcal{F})$ associated with $X^F$ is a resistance form. A precise definition of such an object appears in [26, Definition 3.1], for example, but the key property is the finiteness of the corresponding resistance, i.e.

$$R(x,y) := \sup\left\{ \frac{|f(x) - f(y)|^2}{\mathcal{E}(f,f)} : f \in \mathcal{F}, \ \mathcal{E}(f,f) > 0 \right\}$$

is finite for any $x, y \in F$.

Corollary 4.3. Suppose that $X^F$ is a $\pi$-symmetric Hunt process on $F$ such that the associated Dirichlet form $(\mathcal{E}, \mathcal{F})$ is a resistance form, then (1.1) - (1.5) are automatically satisfied.

Proof. The fact that $X^F$ is conservative is clear since for a resistance form $1 \in \mathcal{F}$ and $\mathcal{E}(1,1) = 0$. That (1.2) holds is proved in [26, Lemma 10.7]. Moreover, we can check that the non-negative operator corresponding to $(\mathcal{E}, \mathcal{F})$ has a compact resolvent (see [26, Lemma 9.7] and [29, Theorem B.1.13]) and exhibits a spectral gap (this is an easy consequence of the fact that, for a resistance form, $\mathcal{E}(f,f) = 0$ if and only if $f$ is constant). Thus, by Lemma 4.1 and Lemma 4.2, the transition density of $X^F$ also satisfies (1.3) - (1.5). □



5 Examples 

The mixing time results of the previous sections have many applications. To begin with a particularly simple one, consider $G^N$ to be a discrete $d$-dimensional box of side-length $N$, i.e. vertex set $\{1, 2, \dots, N\}^d$ and nearest neighbor connections. By applying classical results about the convergence of the simple random walk on this graph to Brownian motion on $[0,1]^d$ reflected at the boundary, Theorem 1.4 readily implies that the $L^p$-mixing time of the simple random walk on $\{1, 2, \dots, N\}^d$, when rescaled by $N^{-2}$, converges to the $L^p$-mixing time of the limit process for any $p \in [1, \infty]$. A similar result could be proved for the random walk on the discrete torus $(\mathbb{Z}/N\mathbb{Z})^d$. More interestingly, however, as we will now demonstrate, it is possible to apply our main results in a number of examples where the graphs, and sometimes limiting spaces, are random: self-similar fractal graphs with random weights, critical Galton-Watson trees, the critical Erdős-Rényi random graph, and the range of the random walk in high dimensions. For the second and third of these, we will in the next section go on to describe how the convergence in distribution of mixing times we establish can be applied to relate tail asymptotics for mixing time distributions of the discrete and continuous models.
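
The $N^{-2}$ scaling in the box example can be observed numerically in dimension one. The following is a minimal sketch under assumptions made only for this illustration: $d = 1$, $p = 1$, the threshold 1/4 from the definition of the mixing time at a point, and a lazy walk (holding probability 1/2) to avoid the periodicity of the simple random walk on a path; the paper's own conventions for handling periodicity may differ. All function names are hypothetical.

```python
# Sketch: L^1-mixing times of a lazy walk on the path {1, ..., N}, rescaled by N^{-2}.
# Assumptions: d = 1, holding probability 1/2 (to avoid periodicity), threshold 1/4.

def path_degrees(n):
    """Vertex degrees of the path graph with n vertices."""
    return [1 if x in (0, n - 1) else 2 for x in range(n)]

def step(law, deg):
    """One step of the lazy walk applied to a distribution on the path."""
    n = len(law)
    new = [0.5 * mass for mass in law]            # hold with probability 1/2
    for x in range(n):
        for y in (x - 1, x + 1):
            if 0 <= y < n:
                new[y] += 0.5 * law[x] / deg[x]   # move to a uniformly chosen neighbour
    return new

def mixing_time_from(x, n, threshold=0.25, max_steps=100000):
    """Smallest m with D_1(x, m) = sum_y |P^m(x, y) - pi(y)| <= threshold."""
    deg = path_degrees(n)
    total = float(sum(deg))
    pi = [d / total for d in deg]
    law = [1.0 if y == x else 0.0 for y in range(n)]
    for m in range(max_steps + 1):
        if sum(abs(a - b) for a, b in zip(law, pi)) <= threshold:
            return m
        law = step(law, deg)
    raise RuntimeError("threshold not reached within max_steps")

def mixing_time(n):
    """Mixing time of the whole graph: the supremum over starting vertices."""
    return max(mixing_time_from(x, n) for x in range(n))

times = {n: mixing_time(n) for n in (8, 16, 32)}
for n, t in times.items():
    print(n, t, t / n ** 2)
```

Doubling the side-length should multiply the mixing time by roughly four, so that the rescaled values `t / n ** 2` stabilise as `n` grows; the supremum over starting vertices mirrors the representation of the mixing time of the whole space as a supremum of mixing times from points.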

5.1 Self-similar fractal graphs with random weights 

Although the results we have proved apply more generally to self-similar fractal graphs (see below for some further comments on this point), to keep the presentation concise we restrict our attention here to graphs based on the classical Sierpinski gasket, the definition of which we now recall. Suppose $p_1, p_2, p_3$ are the vertices of an equilateral triangle in $\mathbb{R}^2$. Define the similitudes

$$\psi_i(x) := p_i + \frac{x - p_i}{2}, \qquad i = 1, 2, 3.$$

Since $(\psi_i)_{i=1}^3$ is a family of contraction maps, there exists a unique non-empty compact set $F$ such that $F = \bigcup_{i=1}^3 \psi_i(F)$; this is the Sierpinski gasket. We will suppose $d_F$ is the intrinsic shortest path metric on $F$ defined in [27], and note that this induces the same topology as the Euclidean metric. Moreover, we suppose $\pi$ is the $(\ln 3)/(\ln 2)$-Hausdorff measure on $F$ with respect to the Euclidean metric, normalized to be a probability measure. This measure is non-atomic, has full support and satisfies $\pi(\partial B(x,r)) = 0$ for every $x \in F$, $r > 0$ (see [14, Lemma 25]).

We now define a sequence of graphs $(G^N)_{N \geq 0}$ by setting

$$V(G^N) := \bigcup_{i_1, \dots, i_N = 1}^{3} \psi_{i_1 \dots i_N}(V_0),$$

where $V_0 := \{p_1, p_2, p_3\}$ and $\psi_{i_1 \dots i_N} := \psi_{i_1} \circ \dots \circ \psi_{i_N}$, and

$$E(G^N) := \left\{ \{\psi_{i_1 \dots i_N}(x), \psi_{i_1 \dots i_N}(y)\} : x, y \in V_0, \ x \neq y, \ i_1, \dots, i_N \in \{1, 2, 3\} \right\}.$$
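
The construction of $V(G^N)$ and $E(G^N)$ can be sketched in code as follows. This is an illustrative implementation only: the function names are hypothetical, indices run over $\{0,1,2\}$ rather than $\{1,2,3\}$, and points are stored in integer barycentric coordinates with respect to $p_1, p_2, p_3$ (a representation chosen here so that coinciding vertices of neighbouring cells are identified exactly, without floating-point comparisons).

```python
from itertools import combinations, product

# p_1, p_2, p_3 in barycentric coordinates; a triple (a, b, c) stands for the point
# (a*p_1 + b*p_2 + c*p_3) / (a + b + c).
CORNERS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def apply_word(word, point):
    """Image of `point` under psi_{i_1} o ... o psi_{i_N} for word = (i_1, ..., i_N)."""
    scale = sum(point)
    for i in reversed(word):  # the innermost map psi_{i_N} acts first
        # psi_i(x) = p_i + (x - p_i) / 2 = (p_i + x) / 2; clear the denominator 2 * scale.
        point = tuple(c + (scale if j == i else 0) for j, c in enumerate(point))
        scale *= 2
    return point

def gasket_graph(level):
    """Vertex set and edge set of the level-`level` Sierpinski gasket graph."""
    vertices, edges = set(), set()
    for word in product(range(3), repeat=level):
        cell = [apply_word(word, p) for p in CORNERS]
        vertices.update(cell)
        edges.update(frozenset(pair) for pair in combinations(cell, 2))
    return vertices, edges

for level in range(4):
    V, E = gasket_graph(level)
    print(level, len(V), len(E))
```

Each of the $3^N$ cells contributes three edges, and neighbouring cells share only corner vertices, so the graph has $3^{N+1}$ edges and $(3^{N+1} + 3)/2$ vertices.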






We set $d_{G^N} := d_F|_{V(G^N) \times V(G^N)}$, so that $(V(G^N), d_{G^N})$ converges to $(F, d_F)$ with respect to the Hausdorff distance between compact subsets of $F$. Weights $(\mu^N_e)_{e \in E(G^N), N \geq 0}$ will be selected independently at random from a common distribution, which we assume is supported on an interval $[c_1, c_2]$, where $0 < c_1 \leq c_2 < \infty$. By the procedure described in the introduction, we define from these weights a sequence of random measures $(\pi^N)_{N \geq 0}$ on the vertex sets of our graphs in the sequence $(G^N)_{N \geq 0}$. That $\pi^N$ weakly converges to $\pi$ as Borel probability measures on $F$, almost-surely, can be checked by applying [14, Lemma 26].

To describe the scaling limit of the random walks associated with the random weights $\mu^N$, we appeal to the homogenization result of [30]. To describe this, we first introduce the Dirichlet form associated with the walk on the level $N$ graph by setting, for $f \in \mathbb{R}^{V(G^N)}$,

$$\mathcal{E}^N(f,f) := \sum_{i_1, \dots, i_N = 1}^{3} \sum_{x,y \in V_0, x \neq y} \mu^N_{\psi_{i_1 \dots i_N}(x) \psi_{i_1 \dots i_N}(y)} \left( f(\psi_{i_1 \dots i_N}(x)) - f(\psi_{i_1 \dots i_N}(y)) \right)^2. \tag{5.1}$$

Let $\Lambda^N = (\Lambda^N_{xy})_{x,y \in V_0, x \neq y}$ be the collection of weights such that the associated random walk on $G^0$ is the trace of $X^{G^N}$ onto $V_0$. It then follows from [30, Theorem 3.4] that there exists a deterministic constant $C \in (0, \infty)$ such that

$$\lim_{N \to \infty} \mathbf{E} \left| \left( \frac{5}{3} \right)^N \Lambda^N_{xy} - C \right| = 0,$$

for any $x, y \in V_0$, $x \neq y$. Now, suppose $\mathcal{E}^{G^N}$ is a quadratic form on $\mathbb{R}^{V(G^N)}$ which satisfies (5.1) with $\mu^N_{\psi_{i_1 \dots i_N}(x) \psi_{i_1 \dots i_N}(y)}$ replaced by $C$ in each summand, then define

$$\mathcal{E}(f,f) = \lim_{N \to \infty} \left( \frac{5}{3} \right)^N \mathcal{E}^{G^N}\left( f|_{V(G^N)}, f|_{V(G^N)} \right)$$

for $f \in \mathcal{F}$, where $\mathcal{F}$ is the subset of $C(F, \mathbb{R})$ such that the right-hand side above exists and is finite. It is known that $(\mathcal{E}, \mathcal{F})$ is a local, regular Dirichlet form on $L^2(F, \pi)$, which is also a resistance form (see [29], for example). Thus, by Corollary 4.3, the associated $\pi$-symmetric diffusion $X^F$, which (modulo the scaling constant $C$) is known as Brownian motion on the Sierpinski gasket, satisfies (1.1) - (1.5).

For the case of unbounded fractal graphs, a probabilistic version of (2.12) was proved as [14, Proposition 30(i)] by applying the homogenization result for processes of [31] (cf. [30]). Since the Sierpinski gasket is a finitely ramified fractal, it is a relatively straightforward technical exercise to adapt this result to the compact case by considering a decomposition of the sample paths of the relevant processes into segments started at one of the outer corners of the gasket and stopped upon hitting another.

To expand on this, we will explain how to prove a version of [31, Theorem 3.6] in our setting. (Note that our $X^{G^N}$ is a discrete time Markov chain with $\pi^N$ as the invariant measure, whereas in [31] it was the continuous-time Markov chains with normalized counting measure as the invariant measure that were studied. However, since both measures are comparable and they converge to $\pi$ almost-surely, this difference can be easily resolved.) Recall $p_1$ and $p_2$ are two distinct elements of $V_0$. Let $\sigma^{(0)}_{p_1}(X^{G^N})$ be the first hitting time of $p_1$ by $X^{G^N}$, and for each $i \in \mathbb{N}$, define inductively

$$\sigma^{(i)}_{p_2}(X^{G^N}) := \inf\left\{ m > \sigma^{(i-1)}_{p_1}(X^{G^N}) : X^{G^N}_m = p_2 \right\},$$

$$\sigma^{(i)}_{p_1}(X^{G^N}) := \inf\left\{ m > \sigma^{(i)}_{p_2}(X^{G^N}) : X^{G^N}_m = p_1 \right\}.$$
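Then, we can write, for continuous $f : F \to \mathbb{R}$,

$$\begin{aligned}
E^{G^N}_{x_N}\left[ f(X^{G^N}_{5^N t}) \right] &= E^{G^N}_{x_N}\left[ f(X^{G^N}_{5^N t}) : 5^N t < \sigma^{(0)}_{p_1} \right] + E^{G^N}_{x_N}\left[ f(X^{G^N}_{5^N t}) : \sigma^{(0)}_{p_1} \leq 5^N t < \sigma^{(1)}_{p_2} \right] && (5.2) \\
&\quad + \sum_{i=1}^{\infty} E^{G^N}_{x_N}\left[ f(X^{G^N}_{5^N t}) : \sigma^{(i)}_{p_2} \leq 5^N t < \sigma^{(i)}_{p_1} \right] && (5.3) \\
&\quad + \sum_{i=1}^{\infty} E^{G^N}_{x_N}\left[ f(X^{G^N}_{5^N t}) : \sigma^{(i)}_{p_1} \leq 5^N t < \sigma^{(i+1)}_{p_2} \right], && (5.4)
\end{aligned}$$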



where $x_N \in V(G^N)$ converges to $x \in F$, say. The first summand in the right hand side of (5.2) can be written in terms of the process $X^{G^N}$ killed at $p_1$, and so by tracing the proof of [14, Proposition 30(i)] line by line, we can check it converges to the corresponding expectation involving $X^F$ killed on hitting $p_1$. Similarly, the second summand in (5.2) can be written as

$$E^{G^N}_{x_N}\left[ \mathbf{1}_{\{\sigma^{(0)}_{p_1} \leq 5^N t\}} E^{G^N}_{x_N}\left[ f\left(X^{G^N}_{5^N t - \sigma^{(0)}_{p_1}}\right) \mathbf{1}_{\{5^N t - \sigma^{(0)}_{p_1} < \sigma^{(1)}_{p_2}\}} \circ \theta_{\sigma^{(0)}_{p_1}} \,\Big|\, \mathcal{F}_{\sigma^{(0)}_{p_1}} \right] \right],$$

where $\theta$ is the shift map. Given $\sigma^{(0)}_{p_1} = s$, the strong Markov property allows us to write the inner conditional expectation, which involves the event $\{5^N t - s < \sigma^{(1)}_{p_2}\}$, in terms of the process started at $p_1$ and killed on hitting $p_2$, independently of the distribution of $\sigma^{(0)}_{p_1}$. Thus the second term in the right hand side of (5.2) converges to $E^F_x[f(X^F_t) : \sigma^{(0)}_{p_1}(X^F) \leq t < \sigma^{(1)}_{p_2}(X^F)]$. We can prove convergence of the rest of the terms similarly. Moreover, by applying the estimate for the exit time of the random walks from balls stated as part of [14, Lemma 27], for example, it is straightforward to check that there exists a $t_0 > 0$ such that $P^{G^N}_{p_1}(\sigma_{p_2} \leq 5^N t_0)$ and $P^{G^N}_{p_2}(\sigma_{p_1} \leq 5^N t_0)$ are both bounded above by 1/2, uniformly in $N$. As a consequence of this, one can show that the terms in the sums at (5.3) and (5.4) decay exponentially, uniformly in $N$, and hence that the right hand side of (5.3) converges to $E^F_x[f(X^F_t)]$ as $N \to \infty$. Convergence of the finite dimensional distributions can be shown similarly and we obtain the desired version of [31, Theorem 3.6].

Finally, a probabilistic version of the tightness condition of (2.11) is easily checked by applying (a probabilistic version of) Lemma 2.5, using known resistance estimates for nested fractals (cf. [14, Proposition 30(ii)]), and so Assumption 1 holds in probability due to Proposition 2.4. Thus we are able to apply Theorem 1.4 to deduce the following.

Theorem 5.1. If $t^p_{\mathrm{mix}}(G^N)$ is the mixing time of the random walk on the level $N$ approximation to the Sierpinski gasket equipped with uniformly bounded, independently and identically distributed random weights, then

$$5^{-N} t^p_{\mathrm{mix}}(G^N) \to t^p_{\mathrm{mix}}(F)$$

in probability, where $t^p_{\mathrm{mix}}(F) \in (0, \infty)$ is the mixing time of the diffusion $X^F$.

Let us remark that the same argument will yield at least two generalizations of this theorem. Firstly, it is not necessary for the weights to be independent and identically distributed, but rather it will be sufficient for them only to be 'cell independent', i.e. each collection $(\mu^N_{\psi_{i_1 \dots i_N}(x) \psi_{i_1 \dots i_N}(y)})_{x,y \in V_0, x \neq y}$ is independent and identically distributed as $(\mu^0_{xy})_{x,y \in V_0, x \neq y}$. (We note that without a symmetry condition, though, the limiting diffusion will no longer be guaranteed to be the Brownian motion on the Sierpinski gasket.) Secondly, the Sierpinski gasket is just one example of a nested fractal. Identical arguments could be applied to obtain corresponding mixing time results for sequences of graphs based on any of the highly-symmetric fractals that come from this class (since the key references [2], [30] and [31] all incorporate nested fractals already).

Finally, variations on the above mixing time convergence result can also be established for examples along the lines of those appearing in [14, Sections 7.4 and 7.5]. These include: an almost-sure statement for Vicsek set-type graphs (which complements the mixing time bounds for deterministic versions of these graphs proved in [21]); a convergence of mixing times for deterministic Sierpinski carpet graphs; and a subsequential limit for Sierpinski carpets with random weights. Since many of the ideas needed for these applications are similar to those discussed above, we omit the details.



5.2 Critical Galton-Watson trees

The connection between critical Galton-Watson processes and $\alpha$-stable trees is now well-known, and so we will be brief in introducing it. Let $\xi$ be a mean 1 random variable whose distribution is aperiodic (not supported on a sub-lattice of $\mathbb{Z}$). Furthermore, suppose that $\xi$ is in the domain of attraction of a stable law with index $\alpha \in (1,2]$, by which we mean that there exists a sequence $a_N \to \infty$ such that

$$\frac{\xi[N] - N}{a_N} \to \Xi, \tag{5.5}$$

in distribution, where $\xi[N]$ is the sum of $N$ independent copies of $\xi$ and the limit random variable $\Xi$ satisfies $\mathbf{E}(e^{-\lambda \Xi}) = e^{-\lambda^{\alpha}}$ for $\lambda > 0$. If $\mathcal{T}_N$ is a Galton-Watson tree with offspring distribution $\xi$ conditioned to have total progeny $N$, then it is the case that

$$N^{-1} a_N \mathcal{T}_N \to \mathcal{T}^{(\alpha)}, \tag{5.6}$$

in distribution with respect to the Gromov-Hausdorff distance between compact metric spaces, where $\mathcal{T}^{(\alpha)}$ is an $\alpha$-stable tree normalized to have total mass equal to 1 (see [36, Theorem 4.3], which is a corollary of a result originally proved in [15]). Note that the left-hand side here is shorthand for the metric space $(V(\mathcal{T}_N), N^{-1} a_N d_{\mathcal{T}_N})$, where $V(\mathcal{T}_N)$ is the vertex set of $\mathcal{T}_N$ and $d_{\mathcal{T}_N}$ is the shortest path graph distance on this set.

The $\alpha$-stable tree $\mathcal{T}^{(\alpha)}$ is almost-surely a compact metric space. Moreover, there is a natural non-atomic probability measure upon it, $\pi^{(\alpha)}$ say, which has full support, and appears as the limit of the uniform measure on the approximating graph trees. Usefully, we can decompose this measure in terms of a collection of measures of level sets of the tree. More specifically, in the construction of the $\alpha$-stable tree from an excursion we can naturally choose a root $\rho \in \mathcal{T}^{(\alpha)}$. We define $\mathcal{T}^{(\alpha)}(r) := \{x \in \mathcal{T}^{(\alpha)} : d_{\mathcal{T}^{(\alpha)}}(\rho, x) = r\}$ to be the collection of vertices at height $r$ above this vertex. For almost-every realization of $\mathcal{T}^{(\alpha)}$, there then exists a cadlag sequence of finite measures on $\mathcal{T}^{(\alpha)}$, $(\ell^r)_{r > 0}$, such that $\ell^r$ is supported on $\mathcal{T}^{(\alpha)}(r)$ for each $r$ and

$$\pi^{(\alpha)} = \int_0^{\infty} \ell^r \, dr$$

(see [16, Section 4.2]). Clearly this implies that $\pi^{(\alpha)}(\partial B_{\mathcal{T}^{(\alpha)}}(\rho, r)) = 0$ for every $r > 0$, for almost-every realization of $\mathcal{T}^{(\alpha)}$. Since $\alpha$-stable trees satisfy a root-invariance property (see [16, Theorem 4.8]), one can easily extend this result to hold for $\pi^{(\alpha)}$-a.e. $x \in \mathcal{T}^{(\alpha)}$. Although this is not quite the assumption of the introduction that $\pi^{(\alpha)}(\partial B_{\mathcal{T}^{(\alpha)}}(x,r)) = 0$ for every $x \in \mathcal{T}^{(\alpha)}$, $r > 0$, by a minor tweak of the proof of Proposition 3.2 we are still able to apply our mixing time convergence results in the same way.
Upon almost-every realization of the metric measure space $(\mathcal{T}^{(\alpha)}, \pi^{(\alpha)})$, it is possible to
define a corresponding Brownian motion $X^{(\alpha)}$ (to do this, apply [28, Theorem 5.4], in the way
described in [1U|, Section 2.2]). This is a conservative $\pi^{(\alpha)}$-symmetric Hunt process, and the
associated Dirichlet form $(\mathcal{E}_{\mathcal{T}^{(\alpha)}}, \mathcal{F}_{\mathcal{T}^{(\alpha)}})$ is actually a resistance form. Thus we can again apply
Corollary 4.3 to confirm that (1.1)-(1.5) hold for the corresponding transition density,
$(q^{(\alpha)}_t(x, y))_{x, y \in \mathcal{T}^{(\alpha)}, t > 0}$ say. Now, in [13], it was demonstrated that if $P^{\mathcal{T}_N}_{\rho^N}$ is the law of the
random walk on $\mathcal{T}_N$ started from its root (original ancestor) $\rho^N$ and $\pi^N$ is its stationary
probability measure, then, after embedding all the objects into an underlying Banach space in a
suitably nice way, the conclusion of (5.6) can be extended to the distributional convergence of



,J Ae[0,i] , 



to $(\mathcal{T}^{(\alpha)}, \pi^{(\alpha)}, P^{(\alpha)}_\rho)$, where $P^{(\alpha)}_\rho$ is the law of $X^{(\alpha)}$ started from $\rho$. By applying the fixed starting
point version of the local limit result of Proposition 2.4 (cf. [14, Theorem 1]), similarly to the
argument of [14, Section 7.2] for the Brownian continuum random tree, which corresponds
to the case $\alpha = 2$, one can obtain from this a distributional version of Assumption 2. (The
tightness condition of (2.11) is easily checked by applying Lemma 2.5.)

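Lemma 5.2. For any compact interval $I \subset (0, \infty)$,

$$\left( \left(V(\mathcal{T}_N), N^{-1} a_N d_{\mathcal{T}_N}, \rho^N\right), \pi^N, \left(q^N_{\lfloor N^2 a_N^{-1} t \rfloor}(\rho^N, x)\right)_{x \in V(\mathcal{T}_N), t \in I} \right)$$

converges in distribution to $\left( (\mathcal{T}^{(\alpha)}, d_{\mathcal{T}^{(\alpha)}}, \rho), \pi^{(\alpha)}, (q^{(\alpha)}_t(\rho, x))_{x \in \mathcal{T}^{(\alpha)}, t \in I} \right)$ in a spectral pointed
Gromov-Hausdorff sense.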

Consequently, since the space in which the above convergence in distribution occurs is
separable, we can use a Skorohod coupling argument to deduce from this and Theorem 3.5 the
following mixing time convergence result. We remark that the $\sqrt{2}$ that appears in the finite
variance result is simply an artefact of the particular scaling we have described here, and could
alternatively have been absorbed in the scaling of metrics.

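Theorem 5.3. Fix $p \in [1, \infty]$. If $t^p_{\mathrm{mix}}(\rho^N)$ is the $L^p$-mixing time of the random walk on $\mathcal{T}_N$
started from its root $\rho^N$, then

$$N^{-2} a_N t^p_{\mathrm{mix}}(\rho^N) \to t^p_{\mathrm{mix}}(\rho),$$

in distribution, where $t^p_{\mathrm{mix}}(\rho) \in (0, \infty)$ is the $L^p$-mixing time of the Brownian motion on $\mathcal{T}^{(\alpha)}$
started from $\rho$. In particular, in the case when the offspring distribution has finite variance $\sigma^2$,
it is the case that

$$\frac{\sigma}{\sqrt{2}} N^{-3/2} t^p_{\mathrm{mix}}(\rho^N) \to t^p_{\mathrm{mix}}(\rho),$$

in distribution.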



Remark 5.4. We note that it was only for convenience that the convergence of the random
walks on the trees $\mathcal{T}_N$, $N \geq 1$, to the Brownian motion on $\mathcal{T}^{(\alpha)}$ was proved from a single starting
point in [13]. We do not anticipate any significant problems in extending this result to hold for
arbitrary starting points. Indeed, the first step would be to make the obvious adaptations to
the proof of [13, Lemma 4.2] to extend the result, which demonstrates convergence of simple
random walks (and related additive functionals) on subtrees of $\mathcal{T}_N$ consisting of a finite number
of branch segments to the corresponding continuous objects, from the case when all the random
walks start from the root to an arbitrary starting point version. An argument identical to the
remainder of [13, Section 4] could then be used to obtain the convergence of simple random
walks on the whole trees, at least in the case when the starting point of the diffusion is in one
of the finite subtrees considered. Since the union of the finite subtrees is dense in the limiting
space, we could subsequently use the heat kernel continuity properties to obtain the non-pointed
spectral Gromov-Hausdorff version of Lemma 5.2. However, we do not pursue this approach
here as it would require a substantial amount of space and new notation that is not relevant
to the main ideas of this article. Were it to be checked, Theorem 1.4 would imply, for any
$p \in [1, \infty]$, the distributional convergence of $t^p_{\mathrm{mix}}(\mathcal{T}_N)$, the $L^p$-mixing time of the random walk
on $\mathcal{T}_N$, when rescaled appropriately, to $t^p_{\mathrm{mix}}(\mathcal{T}^{(\alpha)}) \in (0, \infty)$, the $L^p$-mixing time of the Brownian
motion on $\mathcal{T}^{(\alpha)}$.

5.3 Critical Erdős-Rényi random graph

Closely related to the random trees of the previous section is the Erdős-Rényi random graph at
criticality. In particular, let $G(N, p)$ be the random graph in which every edge of the complete
graph on $N$ labeled vertices $\{1, \dots, N\}$ is present with probability $p$ independently of the other
edges. Supposing $p = N^{-1} + \lambda N^{-4/3}$ for some $\lambda \in \mathbb{R}$, so that we are in the so-called critical
window, it is known that the largest connected component $\mathcal{C}^N$, equipped with its shortest path
graph metric $d_{\mathcal{C}^N}$, satisfies

$$\left(V(\mathcal{C}^N), N^{-1/3} d_{\mathcal{C}^N}\right) \to (\mathcal{M}, d_{\mathcal{M}})$$

in distribution, again with respect to the Gromov-Hausdorff distance between compact metric
spaces, where $(\mathcal{M}, d_{\mathcal{M}})$ is a random compact metric space [1]. (In fact, this and all the results
given in this subsection hold for a family of $i$-th largest connected components for all $i \in \mathbb{N}$.
For notational simplicity, we only discuss the largest connected component $\mathcal{C}^N$.) Moreover, in
[9], it was shown that the associated random walks started from a root vertex $\rho^N$ satisfy a
distributional convergence result of the form

$$\left(N^{-1/3} X^{\mathcal{C}^N}_{\lfloor Nt \rfloor}\right)_{t \geq 0} \to \left(X^{\mathcal{M}}_t\right)_{t \geq 0},$$

where $X^{\mathcal{M}}$ is a diffusion on the space $\mathcal{M}$ started from a distinguished vertex $\rho \in \mathcal{M}$. Although
the invariant probability measures of the random walks, $\pi^N$ say, were not considered in [9], it is
not difficult to extend this result to include them since the hard work regarding their convergence
has already been completed (see [9, Lemma 6.3], in particular). Hence, by again applying the
fixed starting point version of the local limit result of Proposition 2.4 (using Lemma 2.5 again
to deduce the relevant tightness condition), we are able to obtain the analogue of Lemma 5.2
in this setting.
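
The model described at the start of this subsection is easy to simulate. The following is a minimal sketch, not taken from the paper: it samples $G(N, p)$ edge by edge and returns the size of the largest connected component using a union-find structure. The function names `largest_component` and `critical_probability`, the parameter `lam` (standing for $\lambda$) and the fixed seed are illustrative assumptions.

```python
import random


def critical_probability(num_vertices, lam):
    """Edge probability p = N^{-1} + lam * N^{-4/3} in the critical window."""
    return 1.0 / num_vertices + lam * num_vertices ** (-4.0 / 3.0)


def largest_component(num_vertices, edge_prob, seed=0):
    """Sample G(N, p) and return the number of vertices of its largest component."""
    rng = random.Random(seed)
    parent = list(range(num_vertices))

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Each of the N(N-1)/2 possible edges is present independently with probability p.
    for u in range(num_vertices):
        for w in range(u + 1, num_vertices):
            if rng.random() < edge_prob:
                parent[find(u)] = find(w)

    sizes = {}
    for v in range(num_vertices):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())
```

For instance, `largest_component(N, critical_probability(N, 0.0))` gives one sample of $\#V(\mathcal{C}^N)$ at $\lambda = 0$; repeating over seeds and increasing $N$ is one way to look at the $N^{2/3}$ volume scaling empirically.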

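Lemma 5.5. For any compact interval $I \subset (0, \infty)$,

$$\left( \left(V(\mathcal{C}^N), N^{-1/3} d_{\mathcal{C}^N}, \rho^N\right), \pi^N, \left(q^N_{\lfloor Nt \rfloor}(\rho^N, x)\right)_{x \in V(\mathcal{C}^N), t \in I} \right)$$

converges in distribution to $\left( (\mathcal{M}, d_{\mathcal{M}}, \rho), \pi^{\mathcal{M}}, (q^{\mathcal{M}}_t(\rho, x))_{x \in \mathcal{M}, t \in I} \right)$, where $\pi^{\mathcal{M}}$ is the invariant
probability measure of $X^{\mathcal{M}}$ and $(q^{\mathcal{M}}_t(x, y))_{x, y \in \mathcal{M}, t > 0}$ is its transition density with respect to this
measure, in a spectral pointed Gromov-Hausdorff sense.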

In order to proceed as above, we must of course check that $\pi^{\mathcal{M}}$ and $q^{\mathcal{M}}$ satisfy a number
of technical conditions. To do this, first observe that a typical realization of $\mathcal{M}$ looks like a
(rescaled) typical realization of the Brownian continuum random tree $\mathcal{T}^{(2)}$ glued together at a
finite number of pairs of points [1]. Since $\pi^{\mathcal{M}}$ can be considered as the image of the canonical
measure $\pi^{(2)}$ on $\mathcal{T}^{(2)}$ under this gluing map, it is elementary to obtain from the statements of
the previous section regarding $\pi^{(2)}$ that $\pi^{\mathcal{M}}$ is almost-surely non-atomic, has full support and
satisfies $\pi^{\mathcal{M}}(\partial B_{\mathcal{M}}(x, r)) = 0$ for $\pi^{\mathcal{M}}$-a.e. $x \in \mathcal{M}$ and every $r > 0$, as desired. For $q^{\mathcal{M}}$, we
simply observe that because the Dirichlet form corresponding to $X^{\mathcal{M}}$ is a resistance form ([9,
Proposition 2.1]), we can once again apply Corollary 4.3 to establish conditions (1.1)-(1.5).
Given these results, pointwise mixing time convergence follows from Theorem 3.5.

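Theorem 5.6. Fix $p \in [1, \infty]$. If $t^p_{\mathrm{mix}}(\rho^N)$ is the $L^p$-mixing time of the random walk on $\mathcal{C}^N$
started from its root $\rho^N$, then

$$N^{-1} t^p_{\mathrm{mix}}(\rho^N) \to t^p_{\mathrm{mix}}(\rho),$$

in distribution, where $t^p_{\mathrm{mix}}(\rho) \in (0, \infty)$ is the $L^p$-mixing time of the Brownian motion on $\mathcal{M}$
started from $\rho$.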

Remark 5.7. As discussed in Remark 5.4, we do not expect any major barriers in extending
the above result to arbitrary starting points. The first task in doing this would be to adapt the
convergence result proved in [9] regarding the convergence of simple random walks on subgraphs
of $\mathcal{C}^N$ formed of a finite number of line segments ([9, Lemma 6.4]) to arbitrary starting points.
One could then extend this to obtain the desired convergence result for simple random walks
on the entire space using ideas from [9, Section 7] and heat kernel continuity. It would also
be necessary to introduce a new Gromov-Hausdorff-type topology to state the result, as the
one used in [9] is only suitable for the pointed case. Again, we suspect taking these steps will
simply be a lengthy technical exercise, and choose not to follow them through here. We do
though reasonably expect that $t^p_{\mathrm{mix}}(\mathcal{C}^N)$, the $L^p$-mixing time of the random walk on $\mathcal{C}^N$, when
rescaled appropriately, converges in distribution to $t^p_{\mathrm{mix}}(\mathcal{M}) \in (0, \infty)$, the $L^p$-mixing time of
the Brownian motion on $\mathcal{M}$, for any $p \in [1, \infty]$.

5.4 Random walk on range of random walk in high dimensions

Let $S = (S_n)_{n \geq 0}$ be the simple random walk on $\mathbb{Z}^d$ started from 0, built on an underlying
probability space with probability measure $\mathbf{P}$, and define the range of $S$ up to time $N$ to be the
graph $G^N$ with vertex set

$$V(G^N) := \{S_n : 0 \leq n \leq N\}, \qquad (5.7)$$

and edge set

$$E(G^N) := \{\{S_{n-1}, S_n\} : 1 \leq n \leq N\}. \qquad (5.8)$$

In this section, we will explain how to prove that if $d \geq 5$, which is an assumption henceforth,
then the mixing times of the sequence of graphs $(G^N)_{N \geq 1}$ grow asymptotically as $cN^2$, $\mathbf{P}$-a.s.,
where $c$ is a deterministic constant. Since doing this primarily depends on making relatively
simple adaptations of the high-dimensional scaling limit result of [12] for the random walk on
the entire range of $S$ (i.e. the $N = \infty$ case) to the finite length setting, we will be brief with
the details.
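
To make the definitions (5.7) and (5.8) concrete, here is a minimal simulation sketch; it is an illustration only and not part of the paper's argument. The function name `range_graph`, its parameters and the fixed seed are hypothetical choices.

```python
import random


def range_graph(num_steps, dim, seed=0):
    """Return (vertices, edges) of the range of a simple random walk on Z^dim.

    vertices: set of lattice points visited up to time num_steps, as in (5.7).
    edges: set of frozensets {S_{n-1}, S_n} for 1 <= n <= num_steps, as in (5.8).
    """
    rng = random.Random(seed)
    position = (0,) * dim
    vertices = {position}
    edges = set()
    for _ in range(num_steps):
        # A simple random walk step: pick a coordinate axis and a direction uniformly.
        axis = rng.randrange(dim)
        sign = rng.choice((-1, 1))
        new_position = position[:axis] + (position[axis] + sign,) + position[axis + 1:]
        vertices.add(new_position)
        edges.add(frozenset((position, new_position)))
        position = new_position
    return vertices, edges


vertices, edges = range_graph(num_steps=1000, dim=5)
```

Because the walk may revisit points and retrace edges, both sets can be smaller than the number of steps; in dimension $d \geq 5$ such overlaps are local, which is what makes the cut-time decomposition discussed next effective.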

First, suppose that $S = (S_n)_{n \in \mathbb{Z}}$ is a two-sided extension of $(S_n)_{n \geq 0}$ such that $(S_{-n})_{n \geq 0}$ is
an independent copy of $(S_n)_{n \geq 0}$. The set of cut-times for this process,

$$\mathcal{T} := \left\{n : S_{(-\infty, n]} \cap S_{[n+1, \infty)} = \emptyset\right\},$$

is known to be infinite $\mathbf{P}$-a.s. ([17]). Thus we can write $\mathcal{T} = \{T_n : n \in \mathbb{Z}\}$, where
$\dots < T_{-1} < T_0 \leq 0 < T_1 < T_2 < \dots$. The corresponding set of cut-points is given by
$\mathcal{C} := \{C_n : n \in \mathbb{Z}\}$, where $C_n := S_{T_n}$. For these objects, an ergodicity argument can be applied
to obtain that, $\mathbf{P}$-a.s., as $|n| \to \infty$,

$$\frac{T_n}{n} \to \tau(d) := \mathbf{E}(T_1 \mid 0 \in \mathcal{T}) \in [1, \infty), \qquad \frac{d_G(0, C_n)}{|n|} \to \delta(d) := \mathbf{E}(d_G(0, C_1) \mid 0 \in \mathcal{T}) \in [1, \infty), \qquad (5.9)$$

where $d_G$ is the shortest path graph distance on the range $G$ of the entire two-sided walk
$S$, which is defined analogously to (5.7) and (5.8). In particular, see [12, Lemma 2.2] for a
proof of the same convergence statements under the measure $\mathbf{P}(\cdot \mid 0 \in \mathcal{T})$, and note that the
conditioning can be removed by using the relationship between $\mathbf{P}$ and $\mathbf{P}(\cdot \mid 0 \in \mathcal{T})$ described
in [12, Lemma 2.1]. Given these results, it is an elementary exercise to check that the metric
space $(V(G^N), \tau(d)\delta(d)^{-1} N^{-1} d_{G^N})$, where $d_{G^N}$ is the shortest path graph distance on $G^N$,
converges $\mathbf{P}$-a.s. with respect to the Gromov-Hausdorff distance to the interval $[0, 1]$ equipped
with the Euclidean metric. Moreover, the same ideas readily yield an extension of this result
to a spectral Gromov-Hausdorff one including that $\pi^N$, the invariant measure of the associated
simple random walk, converges to Lebesgue measure on $[0, 1]$.

Now, for a fixed realization of $G$, let $X = (X_n)_{n \geq 0}$ be the simple random walk on $G$ started
from 0. Define the hitting times by $X$ of the set of cut-points $\mathcal{C}$ by $H_0 := \min\{m \geq 0 : X_m \in \mathcal{C}\}$,
and, for $n \geq 1$, $H_n := \min\{m > H_{n-1} : X_m \in \mathcal{C}\}$. We use these times to define a useful indexing
process $Z = (Z_n)_{n \geq 0}$ taking values in $\mathbb{Z}$. In particular, if $n < H_0$, define $Z_n$ to be the unique
$k \in \mathbb{Z}$ such that $X_{H_0} = C_k$. Similarly, if $n \in [H_{m-1}, H_m)$ for some $m \geq 1$, then define $Z_n$ to be
the unique $k \in \mathbb{Z}$ such that $X_{H_m} = C_k$. Noting that this definition precisely coincides with the
definition of $Z$ in [12], from Lemma 3.5 of that article we have that: for $\mathbf{P}$-a.e. realization of $G$,

$$\left(N^{-1} \tau(d) Z_{\lfloor tN^2 \rfloor}\right)_{t \geq 0} \to \left(B_{t\kappa_2(d)}\right)_{t \geq 0}, \qquad (5.10)$$

in distribution, where $(B_t)_{t \geq 0}$ is a standard Brownian motion on $\mathbb{R}$ started from 0, and $\kappa_2(d) \in
(0, \infty)$ is the deterministic constant defined in [12]. To deduce from (5.10) the following scaling
limit for $X^N$, the simple random walk on $G^N$, we proceed via a time-change argument that is
essentially a reworking of parts of [12, Section 3].

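Lemma 5.8. For $\mathbf{P}$-a.e. realization of $S$, if $X^N$ is started from 0, then

$$\left(\tau(d)\delta(d)^{-1} N^{-1} d_{G^N}\left(0, X^N_{\lfloor tN^2 \rfloor}\right)\right)_{t \geq 0} \to \left(B^{[0,1]}_{t\kappa_2(d)}\right)_{t \geq 0}$$

in distribution, where $B^{[0,1]} = (B^{[0,1]}_t)_{t \geq 0}$ is Brownian motion on $[0, 1]$ started at 0 and reflected
at the boundary.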

Proof. The following proof can be applied to any typical realization of S. To begin with, define 

Z N 

a process \A n ' ) n >o by setting 

n-1 

A Z,N ._ V 1 
m=0 

where $T^{-1}_N := \max\{n : T_n \leq N\}$. From (5.9), we have that $T^{-1}_N \sim \tau(d)^{-1} N$. Combining
this observation with (5.10), one can check that, simultaneously with (5.10), $(N^{-2} A^{Z,N}_{\lfloor tN^2 \rfloor})_{t \geq 0}$
converges in distribution to $(\kappa_2(d)^{-1} A^B_{t\kappa_2(d)})_{t \geq 0}$, where

$$A^B_t := \int_0^t \mathbf{1}_{\{B_s \in [0, 1]\}} \, ds$$

(cf. [12, Lemma 3.5]).

We now apply the above result to establish a scaling limit for the process $X$ observed on
the vertex set $V(\tilde{G}^N) := \{S_n : T_1 \leq n \leq T_{T^{-1}_N}\}$. Specifically, set

$$A^N_n := \sum_{m=0}^{n-1} \mathbf{1}_{\{X_m, X_{m+1} \in V(\tilde{G}^N)\}}.$$

Similarly to the proof of [12, Lemma 3.6], one can check that

$$\sup_{0 \leq m \leq n} \left|A^N_m - A^{Z,N}_m\right| \leq \sum_{m=0}^{n} \mathbf{1}_{\{Z_m \in \{0, 1, 2\} \cup \{T^{-1}_N - 2, T^{-1}_N - 1, T^{-1}_N\}\}}.$$


It is therefore a simple consequence of (5.10) that

$$N^{-2} \sup_{0 \leq m \leq TN^2} \left|A^N_m - A^{Z,N}_m\right|$$

converges to 0 in probability as $N \to \infty$ for any $T \in (0, \infty)$. Since we know from equation (16)
of [12] that

$$N^{-1} \sup_{0 \leq m \leq TN^2} \left|d_G(0, X_m) - \delta(d) Z_m\right|$$

also converges to 0 in probability, we readily obtain

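$$\left(\tau(d)\delta(d)^{-1} N^{-1} d_G\left(0, \tilde{X}^N_{\lfloor tN^2 \rfloor}\right)\right)_{t \geq 0} \to \left(B^{[0,1]}_{t\kappa_2(d)}\right)_{t \geq 0} \qquad (5.11)$$

in distribution, where $\tilde{X}^N = (\tilde{X}^N_n)_{n \geq 0}$ is the random walk $X$ observed on $V(\tilde{G}^N)$ - this is
defined precisely by setting $\tilde{X}^N_n := X_{\alpha^N(n)}$, where $\alpha^N(n) := \max\{m : A^N_m \leq n\}$. We remark that
the particular limit process $B^{[0,1]}$ arises as a consequence of the fact that $(B_{\alpha^B(t)})_{t \geq 0}$, where $\alpha^B$
is the right-continuous inverse of $A^B$, has exactly the distribution of $B^{[0,1]}$.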

Finally, since the process $\tilde{X}^N$ is identical in law to the simple random walk $X^N$ observed
on $V(\tilde{G}^N)$, to replace $\tilde{X}^N$ by $X^N$ in (5.11) it will suffice to check that $X^N$ spends only an
asymptotically negligible amount of time in $V(G^N) \setminus V(\tilde{G}^N)$. Since doing this requires only a
simple adaptation of the proof of [12, Lemma 3.8], we omit the details. To complete the proof,
one then needs to replace $d_G$ by $d_{G^N}$, but this is straightforward since

$$N^{-1} \sup_{0 \leq n \leq N} \left|d_G(0, S_n) - d_{G^N}(0, S_n)\right| \leq N^{-1}\left(T_1 + T_{T^{-1}_N + 1} - T_{T^{-1}_N}\right) \to 0,$$

as $N \to \infty$. □



Although the previous lemma only contains a convergence statement for the random walks
started from the particular vertex 0, there is no difficulty in extending this to the case when
$X^N$ is started from a point $x^N \in V(G^N)$ such that $d_{G^N}(0, x^N) \sim \tau(d)^{-1}\delta(d) N x_0$, and $B^{[0,1]}$ is
started from $x_0 \in [0, 1]$. Applying the local limit result of Proposition 2.4 (to establish (2.11),
we once again appeal to Lemma 2.5), we are able to deduce from this that Assumption 1 holds for
$\mathbf{P}$-a.e. realization of the original random walk.

Lemma 5.9. For $\mathbf{P}$-a.e. realization of $S$, if $I \subset (0, \infty)$ is a compact interval, then

$$\left( \left(V(G^N), \tau(d)\delta(d)^{-1} N^{-1} d_{G^N}\right), \pi^N, \left(q^N_{\lfloor \kappa_2(d)^{-1} N^2 t \rfloor}(x, y)\right)_{x, y \in V(G^N), t \in I} \right)$$

converges in $(\mathcal{M}_I, \Delta_I)$ to the triple consisting of: $[0, 1]$ equipped with the Euclidean metric,
Lebesgue measure on this set and the transition density of Brownian motion on $[0, 1]$ reflected
at the boundary.

Since it is clear that (1.1)-(1.5) hold in this case, we can therefore apply Theorem 1.4 to
obtain the desired convergence of mixing times.

Theorem 5.10. Fix $p \in [1, \infty]$. If $t^p_{\mathrm{mix}}(S_{[0,N]})$ is the $L^p$-mixing time of the simple random walk
on the range of $S$ up to time $N$, then $\mathbf{P}$-a.s.,

$$\kappa_2(d) N^{-2} t^p_{\mathrm{mix}}(S_{[0,N]}) \to t^p_{\mathrm{mix}}([0, 1]),$$

where $t^p_{\mathrm{mix}}([0, 1])$ is the $L^p$-mixing time of the Brownian motion on $[0, 1]$ reflected at the
boundary.

6 Mixing time tail estimates

In this section, we give some sufficient conditions for deriving upper and lower estimates for
mixing times of random walks on finite graphs, primarily using techniques adapted from [35].
We will also discuss how to apply these general estimates to concrete random graphs (see Section
6.3). In order to crystallize the results and applications, most of the proofs shall be postponed
to the appendix.

As will be illustrated by our examples, the results in this section are robust and convenient
for obtaining mixing time tail estimates. Moreover, when the convergence of mixing times (as in
Theorem 1.4) is available for a sequence of graphs, we highlight how, by first deriving estimates
for the relevant continuous mixing time distribution (where similar techniques are sometimes
applicable, see Remark 6.3), it can be possible to deduce results regarding the asymptotic tail
behavior of random graph mixing times that are difficult to obtain directly (see the proof of
Proposition 6.6 or Remark 6.9, for example).

We start by fixing our notation. Let $G = (V(G), E(G))$ be a finite connected graph and $\mu^G$
be a weight function, as in the introduction. Suppose here that $d_G$ is the shortest path metric
on the graph $G$, and denote, for a distinguished vertex $\rho \in V(G)$,

$$B(R) = \{y : d_G(\rho, y) < R\}, \qquad V(R) := \sum_{x \in B(R)} \sum_{y : y \sim x} \mu^G_{xy} = \pi^G(B(R)) \mu(G), \qquad R \in (0, \infty),$$

where we write $x \sim y$ if $\mu^G_{xy} > 0$ and set $\mu(G) := \sum_{x, y \in V(G)} \mu^G_{xy}$. For the Markov chain $X^G$, let

$$\tau_R = \tau_{B(\rho, R)} = \min\{n \geq 0 : X^G_n \notin B(R)\}.$$

We define a quadratic form $\mathcal{E}$ by

$$\mathcal{E}(f, g) = \frac{1}{2} \sum_{x, y \in V(G),\, x \sim y} \mu^G_{xy} (f(x) - f(y))(g(x) - g(y)),$$

and let $H^2 = \{f \in \mathbb{R}^{V(G)} : \mathcal{E}(f, f) < \infty\}$. For disjoint subsets $A$, $B$ of $G$, the effective resistance
between them is then given by:

$$R_{\mathrm{eff}}(A, B)^{-1} = \inf\{\mathcal{E}(f, f) : f \in H^2, f|_A = 1, f|_B = 0\}. \qquad (6.1)$$

If we further define $R_{\mathrm{eff}}(x, y) = R_{\mathrm{eff}}(\{x\}, \{y\})$, and $R_{\mathrm{eff}}(x, x) = 0$, then one can check that
$R_{\mathrm{eff}}(\cdot, \cdot)$ is a metric on $V(G)$ (see [29, Section 2.3]). We will call this the resistance metric. The
resistance metric enjoys the following important (but easy to deduce) estimate,

$$|f(x) - f(y)|^2 \leq R_{\mathrm{eff}}(x, y) \mathcal{E}(f, f), \qquad \forall f \in L^2(G, \mu^G).$$

Moreover, it is easy to verify that if $c_1^{-1} := \inf_{x, y \in G : x \sim y} \mu_{xy} > 0$, then

$$R_{\mathrm{eff}}(x, y) \leq c_1 d_G(x, y), \qquad \forall x, y \in G. \qquad (6.2)$$
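
The variational definition (6.1) can be evaluated on small examples by solving the associated Dirichlet problem. The sketch below is illustrative and not from the paper: it injects a unit current at one vertex, grounds the other, and reads off the voltage, which for a connected weighted graph equals the effective resistance. The function name `effective_resistance` and the edge-dictionary input format are assumptions made for the example, and exact rational arithmetic is used so the output can be compared with hand computations.

```python
from fractions import Fraction


def effective_resistance(num_vertices, conductances, a, b):
    """Effective resistance between vertices a and b of a connected weighted graph.

    conductances: dict mapping an edge (x, y), listed once, to its weight mu_xy.
    """
    if a == b:
        return Fraction(0)
    # Weighted graph Laplacian L = D - W.
    lap = [[Fraction(0)] * num_vertices for _ in range(num_vertices)]
    for (x, y), weight in conductances.items():
        weight = Fraction(weight)
        lap[x][x] += weight
        lap[y][y] += weight
        lap[x][y] -= weight
        lap[y][x] -= weight
    # Ground vertex b (voltage 0) and inject unit current at a: solve L_reduced v = e_a.
    keep = [v for v in range(num_vertices) if v != b]
    size = len(keep)
    rows = [[lap[i][j] for j in keep] + [Fraction(1 if i == a else 0)] for i in keep]
    # Gauss-Jordan elimination; a pivot always exists when the graph is connected.
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [entry / pivot_value for entry in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [e - factor * p for e, p in zip(rows[r], rows[col])]
    # The voltage at a under unit current is the effective resistance.
    return rows[keep.index(a)][size]


cycle = {(0, 1): 1, (1, 2): 1, (2, 3): 1, (3, 0): 1}
path = {(0, 1): 1, (1, 2): 1}
```

On the unit-weight 4-cycle the two arcs between opposite vertices act as resistors of size 2 in parallel, so the resistance is 1, strictly less than the graph distance 2, which is consistent with (6.2) for $c_1 = 1$. On a unit-weight path (or any tree) the resistance metric coincides with the graph distance.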
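Let $v, r : \{0, 1, \dots, \mathrm{diam}_{d_G}(G) + 1\} \to [0, \infty)$ be strictly increasing functions with $v(0) =
r(0) = 0$, $v(1) = r(1) = 1$, which satisfy

$$C_1^{-1} \left(\frac{R}{R'}\right)^{d_1} \leq \frac{v(R)}{v(R')} \leq C_1 \left(\frac{R}{R'}\right)^{d_2}, \qquad C_2^{-1} \left(\frac{R}{R'}\right)^{\alpha_1} \leq \frac{r(R)}{r(R')} \leq C_2 \left(\frac{R}{R'}\right)^{\alpha_2}, \qquad (6.3)$$

for all $0 < R' \leq R \leq \mathrm{diam}_{d_G}(G) + 1$, where $C_1, C_2 \geq 1$, $1 \leq d_1 \leq d_2$ and $0 < \alpha_1 \leq \alpha_2 \leq 1$. In
what follows, $v(\cdot)$ will give the volume growth order and $r(\cdot)$ the resistance growth order. For
convenience, we extend them to functions on $[0, \mathrm{diam}_{d_G}(G) + 1]$ by linear interpolation. For the
rest of the paper, $C_1, C_2, d_1, d_2$ and $\alpha_1, \alpha_2$ stand for the constants given in (6.3).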

6.1 General upper and lower bounds

In this subsection, we give general upper and lower bounds for mixing times. Note that, since
$t^p_{\mathrm{mix}}(\rho) \leq t^p_{\mathrm{mix}}(G)$ and $t^p_{\mathrm{mix}}(G) \leq t^{p'}_{\mathrm{mix}}(G)$ for $p \leq p'$, it will be enough to estimate $t^\infty_{\mathrm{mix}}(G)$ for
the upper bound^1 and $t^1_{\mathrm{mix}}(\rho)$ for the lower bound.

Upper bound. We first give an upper bound of the mixing times that is a reworking of [35,
Corollary 4.2], in our setting.

Lemma 6.1. For any weighted graph $(G, \mu^G)$,

$$t^\infty_{\mathrm{mix}}(G) \leq 4 \, \mathrm{diam}_R(G) \, \mu(G),$$

where $\mathrm{diam}_R(G)$ is the diameter of $G$ with respect to the resistance metric $R_{\mathrm{eff}}$.
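
The bound of Lemma 6.1 can be checked numerically on a small example. The sketch below is an illustration under stated assumptions rather than the paper's definition: it takes the averaged density $q_m(x, y) = (p_m(x, y) + p_{m+1}(x, y)) / (2\pi(y))$, with $p_m$ the $m$-step transition probability, as our reading of (1.7), and takes the $L^\infty$-mixing time to be the first $m$ with $\sup_{x, y} |q_m(x, y) - 1| \leq 1/4$. The helper name `linf_mixing_time` is hypothetical.

```python
def linf_mixing_time(transition, stationary, threshold=0.25, max_steps=10_000):
    """Smallest m with max over x, y of |q_m(x, y) - 1| <= threshold.

    ASSUMPTION: q_m(x, y) = (p_m(x, y) + p_{m+1}(x, y)) / (2 * pi(y)).
    """
    n = len(stationary)

    def step(matrix):
        # Advance the chain by one step: matrix times the transition matrix.
        return [[sum(matrix[x][z] * transition[z][y] for z in range(n))
                 for y in range(n)] for x in range(n)]

    current = [[float(x == y) for y in range(n)] for x in range(n)]  # p_0
    following = step(current)                                         # p_1
    for m in range(max_steps):
        distance = max(
            abs((current[x][y] + following[x][y]) / (2.0 * stationary[y]) - 1.0)
            for x in range(n) for y in range(n)
        )
        if distance <= threshold:
            return m
        current, following = following, step(following)
    raise RuntimeError("no mixing within max_steps")


# Path graph on 6 vertices with unit edge weights.
num_vertices = 6
degree = [1 if x in (0, num_vertices - 1) else 2 for x in range(num_vertices)]
transition = [[1.0 / degree[x] if abs(x - y) == 1 else 0.0
               for y in range(num_vertices)] for x in range(num_vertices)]
total_weight = sum(degree)                       # mu(G) = 10
stationary = [d / total_weight for d in degree]  # pi(x) = mu_x / mu(G)
resistance_diameter = num_vertices - 1           # on a unit-weight tree, R_eff = d_G
mixing_time = linf_mixing_time(transition, stationary)
upper_bound = 4 * resistance_diameter * total_weight
```

Here `upper_bound` is $4 \cdot 5 \cdot 10 = 200$, and the computed `mixing_time` falls well below it, which illustrates that the lemma is a robust but not sharp estimate.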

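Lower bound. We next give the mixing time lower bound. Let $\lambda \geq 1$, $H_0, \dots, H_3 \geq 0$, and let
$C_3 := 2^{-2/\alpha_1} C_2^{-1/\alpha_1}$, where $C_2$ is the constant in (6.3). We give the following two conditions
concerning the volume and resistance growth.

$$R_{\mathrm{eff}}(\rho, y) \leq \lambda^{H_0} r(d_G(\rho, y)), \quad \forall y \in B(R), \quad \text{and} \quad V(R) \leq \lambda^{H_1} v(R), \qquad (6.4)$$

$$R_{\mathrm{eff}}(\rho, B(R)^c) \geq \lambda^{-H_2} r(R) \quad \text{and} \quad V(C_3 \lambda^{-(H_0 + H_2)/\alpha_1} R) \geq \lambda^{-H_3} v(C_3 \lambda^{-(H_0 + H_2)/\alpha_1} R). \qquad (6.5)$$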

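Proposition 6.2. i) For $\lambda, R \geq 1$, assume that $\mu(G) \geq 4V(R)$, and that (6.4), (6.5) hold for
$R$, then

$$t^1_{\mathrm{mix}}(G) \geq C_4 \lambda^{-H_2' - H_3} v(R) r(R), \qquad (6.6)$$

where $H_2' = H_2 + (H_0 + H_2) d_2 / \alpha_1$.

ii) For $\lambda, R \geq 1$, assume that $\mu(G) \geq 4V(R)$, and (6.4), (6.5) hold for $R$ and $\varepsilon_0(\lambda) R$, where
$\varepsilon_0(\lambda) := c_1 \lambda^{-(H_0 + \sum_{i=0}^{3} H_i + H_2')/\alpha_1}$ for some $c_1 > 0$ small enough. Then

$$t^1_{\mathrm{mix}}(\rho) \geq C_4 \lambda^{-H_2' - H_3} v(\varepsilon_0(\lambda) R) r(\varepsilon_0(\lambda) R).$$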

Remark 6.3. Essentially the same argument can be applied to deduce the corresponding mixing
time upper and lower bounds in the continuous setting when we suppose that we have a process
whose Dirichlet form is a resistance form. (Remark A.1 contains details of the upper bound,
and the details of the lower bound are omitted to avoid repetition.)

^1 In fact, for the upper bound it is enough to estimate $t^2_{\mathrm{mix}}(G)$. Indeed, the Cauchy-Schwarz inequality and
(3.5) imply the following known fact for mixing times of symmetric Markov chains: $t^\infty_{\mathrm{mix}}(G) \leq 2 t^2_{\mathrm{mix}}(G; 1/2)$,
where $t^2_{\mathrm{mix}}(G; 1/2)$ is the $L^2$-mixing time of $G$ with 1/2 instead of 1/4 in the definition (1.8).

6.2 Random graph case

We now consider a probability space $(\Omega, \mathcal{F}, P)$ carrying a family of random weighted graphs
$G^N(\omega) = (V(G^N(\omega)), E(G^N(\omega)), \mu^{N(\omega)};\ \omega \in \Omega)$. We assume that, for each $N \in \mathbb{N}$ and $\omega \in \Omega$,
$G^N(\omega)$ is a finite, connected graph containing a marked vertex $\rho^N$, and $\#V(G^N(\omega)) \leq M_N$ for
some non-random constant $M_N < \infty$. (Here, for a set $A$, $\#A$ is the number of elements in $A$.)
Let $d_{G^N(\omega)}(\cdot, \cdot)$ be a graph distance, $B(R) := B_\omega(\rho^N, R)$, and $V(R) := V_\omega(\rho^N, R)$. We write
$X = (X_n, n \geq 0, P^\omega_x, x \in G^N(\omega))$ for the random walk on $G^N(\omega)$, and denote by $p^\omega_n(x, y)$ its
transition density with respect to $\pi^\omega$. Furthermore, we introduce a strictly increasing function
$h : \mathbb{N} \cup \{0\} \to [0, \infty)$ with $h(0) = 0$, which will roughly describe the diameter of $G^N$ with respect
to the graph distance. We then set $\gamma(\cdot) = v(h(\cdot)) \cdot r(h(\cdot))$. Finally, for $i = 1, 2$, we suppose
$p_i : [1, \infty) \to [0, 1]$ are functions such that $\lim_{\lambda \to \infty} p_i(\lambda) = 0$. We then have the following. (Note
that $C_1, d_2$ in the statement are the constants in (6.3).)

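Proposition 6.4. (1) Suppose that the following holds:

$$P(\mathrm{diam}_R(G^N) \geq \lambda r(h(N))) \leq p_1(\lambda), \qquad P(\mu^N(G^N) \geq \lambda v(h(N))) \leq p_2(\lambda), \qquad (6.7)$$

then

$$P(t^\infty_{\mathrm{mix}}(G^N) \geq \lambda \gamma(N)) \leq \inf_{\theta \in [0, 1]} \left(p_1(\lambda^\theta / 8) + p_2(\lambda^{1 - \theta})\right).$$

(2) Suppose there exist $c_1 \leq 1$ and $J \geq (1 + H_1)/d_2$ such that the following holds:

$$P((6.4) \wedge (6.5) \text{ for } R = c_1 \lambda^{-J} h(N)) \geq 1 - p_1(\lambda), \qquad P(\mu^N(G^N) \leq \lambda^{-1} v(h(N))) \leq p_2(\lambda),$$

then there exist $c_2, p_0 > 0$ such that

$$P(t^1_{\mathrm{mix}}(G^N) \leq c_2 \lambda^{-p_0} \gamma(N)) \leq 2 p_1(\lambda) + p_2(\lambda / (4 C_1 c_1^{d_2})).$$

(3) Suppose there exist $c_1 \leq 1$ and $J \geq (1 + H_1)/d_2$ such that the following holds:

$$P((6.4) \wedge (6.5) \text{ for } R = c_1 \lambda^{-J} h(N) \text{ and for } \varepsilon_0(\lambda) R) \geq 1 - p_1(\lambda),$$
$$P(\mu^N(G^N) \leq \lambda^{-1} v(h(N))) \leq p_2(\lambda), \qquad (6.8)$$

where $\varepsilon_0(\lambda)$ is as in Proposition 6.2 ii), then there exist $c_2, p_0 > 0$ such that

$$P(t^1_{\mathrm{mix}}(\rho^N) \leq c_2 \lambda^{-p_0} \gamma(N)) \leq 2 p_1(\lambda) + p_2(\lambda / (4 C_1 c_1^{d_2})).$$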

To illustrate this result, we consider the case when the random graphs $G^N(\omega)$ are obtained
as components of percolation processes on finite graphs, thereby recovering [35, Theorem 1.2(c)].
(In [35], it was actually the lazy random walk that was considered to avoid parity concerns, but
the same techniques apply when we consider $q_\cdot(\cdot, \cdot)$ as in (1.7) instead.)

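Proposition 6.5. Let $G^N$ be a graph with $N$ vertices and with the maximum degree $d \in [3, N - 1]$.
Let $\mathcal{C}^N$ be the largest component of the percolation subgraph of $G^N$ for $0 < p < 1$. Let
$p \leq (1 + \lambda N^{-1/3})/(d - 1)$ for some fixed $\lambda \in \mathbb{R}$, and assume that there exist $c_1, \theta_1 \in (0, \infty)$ and
$K_1 \in \mathbb{N}$ such that

$$P(\#\mathcal{C}^N \leq A^{-1} N^{2/3}) \leq c_1 A^{-\theta_1}, \qquad \forall A, N \geq K_1, \qquad (6.9)$$

then there exist $c_2, \theta_2 \in (0, \infty)$ and $K_2 \in \mathbb{N}$ such that, for all $p \in [1, \infty]$,

$$P(A^{-1} N \leq t^p_{\mathrm{mix}}(\mathcal{C}^N) \leq A N) \geq 1 - c_2 A^{-\theta_2}, \qquad \forall A, N \geq K_2. \qquad (6.10)$$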



Finally, below is a list of exponents for each example in Section 5.

Section                                       v(R)                   r(R)                        h(N)             γ(N)
5.1                                           $R^{\log K / \log L}$  $R^{\log \lambda / \log L}$  $L^N$            $(K\lambda)^N$
5.2 with $a_N = N^{1/\alpha}$, α ∈ (1, 2]      $R^{\alpha/(\alpha-1)}$  $R$                        $N^{1 - 1/\alpha}$  $N^{2 - 1/\alpha}$
5.3                                           $R^2$                  $R$                         $N^{1/3}$        $N$
5.4                                           $R$                    $R$                         $N$              $N^2$


Here the Euclidean distance is used instead of the intrinsic shortest path metric for the examples
in Section 5.1. Note that when $\alpha = 2$ in Section 5.2 (the finite variance case), the growth of $v(R)$
and $r(R)$ is of the same order as in Section 5.3. The difference of scaling exponents of mixing
times (namely $\gamma(N)$) is due to the difference of scaling exponents for graph distances (namely
$h(N)$). We also observe that the convergence to a stable law at (5.5) forces the scaling constants
to be of the form $a_N = N^{1/\alpha} L(N)$ for some slowly varying function $L$ (see [20, Section 35]), and
hence the above table captures all the most important first order behavior for the examples in
Section 5.2.
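
Since $\gamma(\cdot) = v(h(\cdot)) \cdot r(h(\cdot))$, the exponents in the list above are tied together by simple arithmetic when $v$, $r$ and $h$ are pure powers. The sketch below records that bookkeeping; the function name `mixing_exponent` is a hypothetical helper, and the volume exponent $\alpha/(\alpha - 1)$ used for the stable-tree case is the value forced by the relation $\gamma = v(h) r(h)$ together with the other entries of that row.

```python
from fractions import Fraction


def mixing_exponent(volume_exp, resistance_exp, diameter_exp):
    """Exponent of N in gamma(N) = v(h(N)) * r(h(N)) for power-law v, r and h.

    With v(R) = R^a, r(R) = R^b and h(N) = N^c this returns (a + b) * c.
    """
    return (volume_exp + resistance_exp) * diameter_exp


# Critical random graph: v(R) = R^2, r(R) = R, h(N) = N^{1/3}.
erdos_renyi = mixing_exponent(Fraction(2), Fraction(1), Fraction(1, 3))

# Range of random walk in high dimensions: v(R) = R, r(R) = R, h(N) = N.
walk_range = mixing_exponent(Fraction(1), Fraction(1), Fraction(1))


def stable_tree(alpha):
    """Exponent for the alpha-stable tree case, with v(R) = R^{alpha/(alpha-1)}."""
    alpha = Fraction(alpha)
    return mixing_exponent(alpha / (alpha - 1), Fraction(1), 1 - 1 / alpha)
```

For example, `stable_tree(2)` returns 3/2, matching the $N^{3/2}$ scaling of the finite variance case, while `erdos_renyi` equals 1 and `walk_range` equals 2.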

6.3 Examples

Critical Galton-Watson trees of Section 5.2. By combining the results in this section with our
mixing time convergence result, we can establish asymptotic bounds for the distributions of
mixing times of graphs in the sequence $(\mathcal{T}_N)_{N \geq 1}$ in the case when we have a finite variance
offspring distribution.

Proposition 6.6. In the case when the offspring distribution has finite variance, there exist
constants $c_1, c_2, c_3, c_4 \in (0, \infty)$ such that

$$\limsup_{N \to \infty} P\left(N^{-3/2} t^\infty_{\mathrm{mix}}(\mathcal{T}_N) \geq \lambda\right) \leq c_1 e^{-c_2 \lambda^2}, \qquad \forall \lambda \geq 0, \qquad (6.11)$$

and also

$$\limsup_{N \to \infty} P\left(N^{-3/2} t^1_{\mathrm{mix}}(\rho^N) \leq \lambda^{-1}\right) \leq c_3 e^{-c_4 \lambda^{1/25}}, \qquad \forall \lambda \geq 0. \qquad (6.12)$$

Proof. To prove (6.11), we apply the general mixing time upper bound of Lemma 6.1 to deduce
that

$$P\left(N^{-3/2} t^\infty_{\mathrm{mix}}(\mathcal{T}_N) \geq \lambda\right) \leq P\left(8 N^{-1/2} \mathrm{diam}_{d_{\mathcal{T}_N}}(\mathcal{T}_N) \geq \lambda\right),$$

where $\mathrm{diam}_{d_{\mathcal{T}_N}}(\mathcal{T}_N)$ is the diameter of $\mathcal{T}_N$ with respect to $d_{\mathcal{T}_N}$, and we note that $\#E(\mathcal{T}_N)$ is
equal to $2(N - 1)$. By (5.6), the right-hand side here converges to $P(8 \, \mathrm{diam}_{d_{\mathcal{T}^{(2)}}}(\mathcal{T}^{(2)}) \geq \lambda)$.

By construction, the diameter of the continuum random tree $\mathcal{T}^{(2)}$ is bounded above by twice
the supremum of the Brownian excursion of length 1. We can thus use the known distribution
of the latter random variable (see [25], for example) to deduce the relevant bound.

For (6.12), we first apply the convergence in distribution of Theorem 5.3 to deduce that

$$\limsup_{N \to \infty} P\left(N^{-3/2} t^1_{\mathrm{mix}}(\rho^N) \leq \lambda^{-1}\right) \leq P\left(t^1_{\mathrm{mix}}(\rho) \leq \lambda^{-1}\right).$$

Now, for the continuum random tree, define

$$J(\lambda) = \left\{r > 0 : \lambda^{-1} r^2 \leq \pi^{(2)}(B_{\mathcal{T}^{(2)}}(\rho, r)) \leq \lambda r^2, \ R_{\mathcal{T}^{(2)}}(\rho, B_{\mathcal{T}^{(2)}}(\rho, r)^c) \geq \lambda^{-1} r\right\},$$

where $R_{\mathcal{T}^{(2)}}$ is the resistance on the continuum random tree (see [11, (20)]). Then

$$P(r \in J(\lambda)) \geq 1 - e^{-c\lambda}, \qquad \forall r \in (0, \tfrac{1}{2}], \ \lambda \geq 1,$$

(see [11, Lemmas 4.1 and 7.1]). As a consequence of this, we can apply the continuous version
of the mixing time lower bound discussed in Remark 6.3 (with $H_0 = 0$, $H_1 = H_2 = H_3 = 1$,
$H_2' = 3$, $\alpha_1 = 1$ and $d_2 = 2$) to deduce the desired result. □
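
As a quick consistency check on the parameters quoted at the end of the proof (a worked computation using the definition $H_2' = H_2 + (H_0 + H_2) d_2 / \alpha_1$ from Proposition 6.2, not an additional result), one can substitute $H_0 = 0$, $H_2 = 1$, $d_2 = 2$ and $\alpha_1 = 1$:

```latex
\[
H_2' \;=\; H_2 + \frac{(H_0 + H_2)\, d_2}{\alpha_1}
     \;=\; 1 + \frac{(0 + 1) \cdot 2}{1}
     \;=\; 3 .
\]
```

With $H_3 = 1$, the prefactor $\lambda^{-H_2' - H_3}$ appearing in the lower bound of Proposition 6.2 is therefore $\lambda^{-4}$ in this example.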



Remark 6.7. The above proof already gives an estimate for the lower tail of $t^1_{\mathrm{mix}}(\rho)$. That the
bound corresponding to (6.11) holds for the limiting tree, i.e.

$$P\left(t^\infty_{\mathrm{mix}}(\mathcal{T}^{(2)}) \geq \lambda\right) \leq c_1 e^{-c_2 \lambda^2},$$

can be proved similarly to the discrete case (see Remark 6.3).

Critical Erdős-Rényi random graph of Section 5.3. Let $\mathcal{C}^N$ be the largest component of the
Erdős-Rényi random graph in the critical window. Then the following holds.

Proposition 6.8. There exist constants $c_1, c_2, c_3, N_0, \theta \in (0, \infty)$ such that

$$\sup_{N \geq 1} P\left(N^{-1} t^\infty_{\mathrm{mix}}(\mathcal{C}^N) \geq \lambda\right) \leq c_1 e^{-c_2 \lambda}, \qquad \forall \lambda \geq 0, \qquad (6.13)$$

$$\sup_{N \geq N_0} P\left(N^{-1} t^1_{\mathrm{mix}}(\mathcal{C}^N) \leq \lambda^{-1}\right) \leq c_3 \lambda^{-\theta}, \qquad \forall \lambda \geq 0. \qquad (6.14)$$

Proof. By [35, Proposition 1.4] and [36, Theorem 1], (6.13) is an application of Proposition 6.4
with $p_1(\lambda) = c_4 e^{-c_5 \lambda^{3/2}}$ and $p_2(\lambda) = c_6 e^{-c_7 \lambda^3}$. (6.14) is a consequence of Proposition 6.5. □



Remark 6.9. (1) The tail estimates for $t^1_{\mathrm{mix}}(\mathcal{C}^N)$ are given in [35, Theorem 1.1] without
quantitative bounds. (In fact, reading the paper very carefully, it can be checked that bounds
similar to Proposition 6.8 are available in the paper.)

(2) It does not seem possible to apply current estimates for the graphs $(\mathcal{C}^N)_{N \geq 1}$ and techniques
for bounding mixing times to replace $t^1_{\mathrm{mix}}(\mathcal{C}^N)$ by $t^1_{\mathrm{mix}}(\rho^N)$ in the latter estimate (see Remark
A.4), or even prove that the sequence $(N / t^1_{\mathrm{mix}}(\rho^N))_{N \geq 1}$ is tight, i.e.

$$\lim_{\lambda \to \infty} \limsup_{N \to \infty} P\left(N^{-1} t^1_{\mathrm{mix}}(\rho^N) \leq \lambda^{-1}\right) = 0.$$

That this final statement is nonetheless true is a simple consequence of Theorem 5.6.



A Appendix: Proof of the statements in Section 6

In this appendix, we prove various results given in Section 6. We adopt the convention that if
we cite elsewhere the constant $c_1$ of Proposition A.3 (for example), we denote it as $c_{A.3.1}$.

A.1 Proof of Lemma 6.1 and Proposition 6.2

Proof of Lemma 6.1. First, note that by [2, Proposition 3 in Chapter 2], we have that

$$E^G_x\left(\sum_{m=0}^{S-1} \mathbf{1}_{\{X^G_m = x\}}\right) = \pi^G(x) E^G_x(S), \qquad (A.1)$$

for any stopping time $S$ with $X^G_S = x$. Taking $S$ to be the first hitting time of $x$ after time
$2m - 1$, and writing $\Pi(x, 2m)$ to represent the law of $X^G_{2m}$ when $X^G$ is started from $x$, we obtain
that

$$E^G_{\Pi(x, 2m)}(\sigma_x) = \sum_{l=0}^{2m-1} \left(p^G_l(x, x) - 1\right) = 2 \sum_{l=0}^{m-1} \left(q^G_{2l}(x, x) - 1\right) \geq 2m \left(q^G_{2m}(x, x) - 1\right),$$

where $\sigma_x$ is the first hitting time of $x$, and the inequality holds because $q^G_{2l}(x, x)$ is decreasing
in $l$ (see the proof of [2, Lemma 9], for example). Since by Cauchy-Schwarz,
$|q^G_{2m}(x, y) - 1| \leq (q^G_{2m}(x, x) - 1)^{1/2} (q^G_{2m}(y, y) - 1)^{1/2}$, it follows that

$$\sup_{x \in V(G)} D^\infty(x, 2m) = \sup_{x, y \in V(G)} \left|q^G_{2m}(x, y) - 1\right| \leq \sup_{x \in V(G)} \left(q^G_{2m}(x, x) - 1\right) \leq \sup_{x, y \in V(G)} \frac{E^G_y(\sigma_x)}{2m}.$$

By applying the commute time identity for random walks on graphs, $E^G_x(\sigma_y) + E^G_y(\sigma_x) =
R_{\mathrm{eff}}(x, y) \mu(G)$, this implies $\sup_{x \in V(G)} D^\infty(x, 2m) \leq \mathrm{diam}_R(G) \mu(G) / 2m$, and the result follows.
□



Remark A.1. As mentioned in Remark 6.3, we can apply essentially the same argument to
deduce the corresponding mixing time upper bound in the continuous setting when we suppose
that we have a process whose Dirichlet form is a resistance form. In particular, suppose that
this is the case for $X^F$, as defined in the introduction. Let $S$ be the first hitting time of $x \in F$
after time $t$, then, for any $f \in L^1(F, \pi)$,

$$E_x\left(\int_0^S f(X^F_s) \, ds\right) = \left(\int_F f \, d\pi\right) E_x(S),$$

which can be obtained by applying an ergodicity argument similar to that used to prove (A.1).
Writing $\Pi(x, t)$ to represent the law of $X^F_t$ when $X^F$ is started from $x$, the expectation on the
right-hand side here satisfies $E_x(S) = t + E_{\Pi(x, t)}(\tau_x) \leq t + \sup_{y \in F} R_{\mathrm{eff}}(x, y)$, where to deduce
the upper bound, we have applied that the commute time identity $E_x(\tau_y) + E_y(\tau_x) = R_{\mathrm{eff}}(x, y)$
also holds for resistance forms (since we are assuming $\pi$ to be a probability measure, it does not
appear explicitly in this version of the identity). Moreover, if $f$ is positive, the left-hand side
is bounded below as follows: $E_x(\int_0^S f(X^F_s) \, ds) \geq \int_0^t \int_F q_s(x, y) f(y) \pi(dy) \, ds$. Combining these
bounds, we have proved that, for positive $f \in L^1(F, \pi)$ such that $\|f\|_{L^1(\pi)} \neq 0$,

$$\frac{\int_0^t \int_F q_s(x, y) f(y) \pi(dy) \, ds}{\|f\|_{L^1(\pi)}} \leq t + \mathrm{diam}_R(F).$$

By choosing a sequence of suitable functions whose support converges to $\{x\}$, the joint continuity
of $(q_t(x, y))_{x, y \in F, t > 0}$ allows us to deduce from this that

$$t q_t(x, x) \leq \int_0^t q_s(x, x) \, ds \leq t + \mathrm{diam}_R(F),$$

where the first inequality holds because $q_t(x, x)$ is decreasing in $t$. The remainder of the proof
is identical to the graph case.

The proof of Proposition 6.2 requires some preparations. Our argument depends on some
estimates for hitting times that are modifications of results in [3].
To begin with, let $B = B(R)$ and define

$$g_B(x, y) = \mu_y^{-1} \sum_{k=0}^{\infty} P^G_x(X_k = y, k < \tau_B).$$

Then, it is easy to show that

$$E^G_x \tau_B = \sum_{y \in B} g_B(x, y) \mu_y, \qquad R_{\mathrm{eff}}(x, B^c) = g_B(x, x)$$

(see, for example, [3, (2.19), (2.20)]). Also, if $A$ and $B$ are disjoint subsets of $G$ and $x \notin A \cup B$,
then (see [3, (2.14)])

$$P^G_x(T_A < T_B) \leq \frac{R_{\mathrm{eff}}(x, B)}{R_{\mathrm{eff}}(x, A)}, \qquad (A.2)$$

where $T_A$ is the hitting time of $A \subset G$. If $C_4 := 8^{-1} C_1^{-1} C_3^{d_2}$, we can then prove the following.
(Here, recall that $C_3 = 2^{-2/\alpha_1} C_2^{-1/\alpha_1}$ and $C_1, C_2$ are the constants in (6.3).)

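Lemma A.2. Let $\lambda \geq 1$ and $H_0, \dots, H_3 \geq 0$.
(a) Suppose (6.4) holds. Then

$$E^G_x \tau_R \leq 2 \lambda^{H_0 + H_1} v(R) r(R) \quad \text{for } x \in B(R). \qquad (A.3)$$

(b) Suppose (6.4) and (6.5) hold. Then

$$E^G_x \tau_R \geq 2 C_4 \lambda^{-H_2' - H_3} v(R) r(R) \quad \text{for } x \in B(C_3 \lambda^{-(H_0 + H_2)/\alpha_1} R), \qquad (A.4)$$

where we recall $H_2' = H_2 + (H_0 + H_2) d_2 / \alpha_1$.

(c) Suppose (6.4) and (6.5), and let $x \in B(C_3 \lambda^{-(H_0 + H_2)/\alpha_1} R)$, then

$$P^G_x(\tau_R \leq n) \leq 1 - \frac{2 C_4 \lambda^{-H_2' - H_3} v(R) r(R) - n}{2 \lambda^{H_0 + H_1} v(R) r(R)}, \qquad \forall n \geq 0. \qquad (A.5)$$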



Proo/. Using (^2]) . we have # eff (z, 5 C ) < i? eff (0, 2) + i? cfT (0, 5 C ) < 2X Ho r(R) for any z £ B. So, 
E ? r B = I>b(*,2/K < J>b(^K = i? eff (^B c )F(i?) < 2A^ +Hl ? ;( J R)r( J R), 

which gives ()A.3p . In order to prove (|A.4p . we first establish the following: for < e < 

l/(2C 2 A Ho+H2 ) 1 / ai = 2 1 /°iC 3 A-(^o+B 2 )/a 1 and y G B (eR), we have 

E?(T, < r*) > 1 - l _ 2 C2 ^ xHa+H , > 1 " 2C^A*°+*". (A.6) 



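Indeed, by the first inequalities of (6.4) and (6.5), we have

\[ R_{\mathrm{eff}}(y,B(R)^c) \ge R_{\mathrm{eff}}(\rho,B(R)^c) - R_{\mathrm{eff}}(\rho,y) \ge \lambda^{-H_2} r(R) - \lambda^{H_0} r(\varepsilon R) > \lambda^{H_0} r(\varepsilon R). \]

So, by (A.2),

\[ P_y^G(\tau_R < T_\rho) \le \frac{R_{\mathrm{eff}}(y,\rho)}{R_{\mathrm{eff}}(y,B(R)^c)} \le \frac{\lambda^{H_0} r(\varepsilon R)}{\lambda^{-H_2} r(R) - \lambda^{H_0} r(\varepsilon R)} \le \frac{C_2\varepsilon^{\alpha_1}\lambda^{H_0+H_2}}{1 - C_2\varepsilon^{\alpha_1}\lambda^{H_0+H_2}}, \]

and (A.6) is obtained. Now, if $y \in B' = B(C_3\lambda^{-(H_0+H_2)/\alpha_1}R)$, then the bound at (A.6) gives that $P_y^G(T_\rho < \tau_B) \ge \frac12$, so

\[ g_B(\rho,y) = g_B(\rho,\rho)\,P_y^G(T_\rho < \tau_B) \ge \tfrac12 g_B(\rho,\rho) = \tfrac12 R_{\mathrm{eff}}(\rho,B^c) \ge \tfrac12 \lambda^{-H_2} r(R). \]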
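By the second inequality of (6.5), we have

\[ \mu(B') \ge \lambda^{-H_3} v(C_3\lambda^{-(H_0+H_2)/\alpha_1}R) \ge C_1^{-1} C_3^{d_2} \lambda^{-(H_0+H_2)d_2/\alpha_1 - H_3} v(R), \]

and therefore we obtain

\begin{align*}
E_\rho^G \tau_B &\ge \sum_{y\in B'} g_B(\rho,y)\,\mu_y \\
&\ge \tfrac12 g_B(\rho,\rho)\,\mu(B') \\
&\ge \tfrac12 C_1^{-1} C_3^{d_2} \lambda^{-H_2-(H_0+H_2)d_2/\alpha_1-H_3} v(R)\,r(R) \\
&= 4C_4 \lambda^{-H_2'-H_3} v(R)\,r(R).
\end{align*}

Moreover, for $x \in B'$ we have that $E_x^G \tau_B \ge P_x^G(T_\rho < \tau_B)\,E_\rho^G \tau_B$, which gives (A.4).
Finally, by the Markov property, (A.3) and (A.4),

\begin{align*}
2C_4\lambda^{-H_2'-H_3} v(R)\,r(R) \le E_x^G \tau_R &\le n + E_x^G\big[1_{\{\tau_R > n\}} E_{X_n}^G(\tau_R)\big] \\
&\le n + 2\lambda^{H_0+H_1} v(R)\,r(R)\,P_x^G(\tau_R > n).
\end{align*}

Rearranging this gives (A.5). □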



The following estimate is a modification of [32, Proposition 3.5 (a)] (see [21, (2.4)] for the important special case $v(R) = R^2$, $r(R) = R$). Note that for $R > \mathrm{diam}_{d_G}(G)$, it is the case that $\tau_R = \infty$, and so (A.7) trivially holds.

Proposition A.3. Let $0 < \varepsilon \le C_3\lambda^{-(H_0+H_2)/\alpha_1}$, and suppose (6.4) and (6.5) for $R$ and $\varepsilon R$. Then

\[ P_y^G\big(\tau_R \le C_4\lambda^{-H_2'-H_3} v(\varepsilon R)\,r(\varepsilon R)\big) \le c_1 \lambda^{H_0+\sum_{i=0}^{3}H_i+H_2'} \varepsilon^{\alpha_1} \quad \text{for } y \in B(\varepsilon R). \tag{A.7} \]



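Proof. We take a kind of bootstrap from (A.5) and (A.6). Let $t_0 > 0$, and set

\[ q(y) = P_y^G(\tau_R < T_\rho), \qquad a(y) = P_y^G(\tau_R \le t_0). \]

Then

\begin{align*}
a(y) = P_y^G(\tau_R \le t_0) &= P_y^G(\tau_R \le t_0,\ \tau_R < T_\rho) + P_y^G(\tau_R \le t_0,\ \tau_R > T_\rho) \\
&\le P_y^G(\tau_R < T_\rho) + P_y^G(T_\rho < \tau_R,\ \tau_R - T_\rho \le t_0) \\
&\le q(y) + (1 - q(y))\,a(\rho) \le q(y) + a(\rho), \tag{A.8}
\end{align*}

using the strong Markov property for the second inequality. Starting the Markov chain $X$ at $\rho$, we have

\[ a(\rho) = P_\rho^G(\tau_R \le t_0) \le E_\rho^G\big[1_{\{\tau_{\varepsilon R} \le t_0\}} P_{X_{\tau_{\varepsilon R}}}^G(\tau_R \le t_0)\big] \le P_\rho^G(\tau_{\varepsilon R} \le t_0) \max_{y \in \partial B(\varepsilon R)} a(y). \tag{A.9} \]

Combining (A.8) and (A.9) gives

\[ a(\rho) \le \frac{\max_{y\in\partial B(\varepsilon R)} q(y)}{P_\rho^G(\tau_{\varepsilon R} > t_0)}. \tag{A.10} \]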
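
The passage from (A.8) and (A.9) to (A.10) can be spelled out as follows. This is only a sketch: the shorthand $p := P_\rho^G(\tau_{\varepsilon R} \le t_0)$ and $\bar q := \max_{y\in\partial B(\varepsilon R)} q(y)$ is introduced here for illustration and is not notation from the paper, and we assume $p < 1$. Bounding each $a(y)$ in (A.9) by (A.8) gives

\[ a(\rho) \le p\,\big(\bar q + a(\rho)\big) \;\Longrightarrow\; a(\rho) \le \frac{p}{1-p}\,\bar q \le \frac{\bar q}{1-p} = \frac{\bar q}{P_\rho^G(\tau_{\varepsilon R} > t_0)}. \]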



Further, using (A.6) with $2\varepsilon$, we have



Let $t_0 = C_4\lambda^{-H_2'-H_3} v(\varepsilon R)\,r(\varepsilon R)$; then using (A.5) for the ball $B(\varepsilon R)$ (note that (6.4) and (6.5) for $\varepsilon R$ are assumed here), we obtain

\[ P_\rho^G(\tau_{\varepsilon R} > t_0) \ge c_0 \lambda^{-H_0-H_1-H_2'-H_3}. \]

Combining this with (A.11), (A.10) and (A.8) completes the proof of (A.7). □



Note that we may and will take $c_{A.3.1} \ge 1/(2C_3^{\alpha_1})$. Now we are ready to prove Proposition 6.2.

Proof of Proposition 6.2. i) We follow the argument in [35, Lemma 5.4]. Let $t \in \mathbb{N}$. If $P_x^G(\tau_B \le t) \ge 1/2$ for all $x \in B(R-1)$, then $\tau_B/t$ is stochastically dominated by a geometric random variable with parameter 1/2, so that $E_x^G[\tau_B] \le 2t$. By this and (A.4), we see that for $t = C_4\lambda^{-H_2'-H_3} v(R)\,r(R)$, there exists some $x \in B(R-1)$ such that $P_x^G(\tau_B \le t) < 1/2$. Further, since $\mu(G) \ge 4V(R)$, $\pi(B(R)) = V(R)/\mu(G) \le 1/4$. Combining these observations, we obtain

\[ D_1(x,t) \ge 2P_x^G(\tau_B > t) - 2\pi(B(R)) \ge 1 - \tfrac12 > \tfrac14, \tag{A.12} \]

so that (6.6) follows.
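
The geometric domination step used above can be unpacked as follows. This is a sketch under the assumption, implicit in the argument, that the bound $P_z^G(\tau_B \le t) \ge 1/2$ is available at every state $z$ the walk can occupy before exiting $B$; the index $k$ is introduced only for this illustration. Applying the Markov property at times $t, 2t, \dots$ gives $P_x^G(\tau_B > kt) \le 2^{-k}$ for $k \ge 0$, and hence

\[ E_x^G[\tau_B] = \sum_{n \ge 0} P_x^G(\tau_B > n) \le t \sum_{k \ge 0} P_x^G(\tau_B > kt) \le t \sum_{k \ge 0} 2^{-k} = 2t. \]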

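ii) Take $\varepsilon = \varepsilon_0(\lambda)$ in Proposition A.3 and let $t = C_4\lambda^{-H_2'-H_3} v(\varepsilon R)\,r(\varepsilon R)$. Then, since $0 < \varepsilon \le C_3\lambda^{-(H_0+H_2)/\alpha_1}$ (this is because we take $c_{A.3.1} \ge 1/(2C_3^{\alpha_1})$), by (A.7) we have $P_\rho^G(\tau_R \le t) \le c_{A.3.1}\lambda^{H_0+\sum_{i=0}^{3}H_i+H_2'}\varepsilon^{\alpha_1} = 1/2$. The rest is the same as the proof of i) except that we take $x = \rho$ in (A.12) and take $c_{6.2.1} = (2c_{A.3.1})^{-1/\alpha_1}$. □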



A.2 Proof of Proposition 6.4 and Proposition 6.5

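Proof of Proposition 6.4. By Lemma 6.1 we have for any $\theta \in [0,1]$ that

\begin{align*}
P\big(t^p_{\mathrm{mix}}(G^N) \ge \lambda\gamma(N)\big) &\le P\big(8\,\mathrm{diam}_R(G^N)\,\mu^N(G^N) \ge \lambda\gamma(N)\big) \\
&\le P\big(8\,\mathrm{diam}_R(G^N) \ge \lambda^{\theta} r(h(N))\big) + P\big(\mu^N(G^N) \ge \lambda^{1-\theta} v(h(N))\big) \\
&\le p_1(\lambda^{\theta}/8) + p_2(\lambda^{1-\theta}),
\end{align*}

which implies the conclusion of (1).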

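For (2), let $R = c_1\lambda^{-J}h(N)$ and define

\begin{align*}
t &:= C_4\lambda^{-H_2'-H_3} v(R)\,r(R) = C_4\lambda^{-H_2'-H_3} v(c_1\lambda^{-J}h(N))\,r(c_1\lambda^{-J}h(N)) \\
&\ge C_4\lambda^{-H_2'-H_3} C_1^{-1} C_2^{-1} (c_1\lambda^{-J})^{d_2+\alpha_2} v(h(N))\,r(h(N)) =: c_2\lambda^{-p_0}\gamma(N).
\end{align*}

Then by Proposition 6.2 i),

\begin{align*}
P\big(t^1_{\mathrm{mix}}(G^N) \le c_2\lambda^{-p_0}\gamma(N)\big) &\le P\big(t^1_{\mathrm{mix}}(G^N) \le t\big) \\
&\le P\big(\text{either (6.4) or (6.5) does not hold for } R = c_1\lambda^{-J}h(N)\big) + P\big(\mu^N(G^N) < 4V(R)\big) \\
&\le p_1(\lambda) + P\big(\mu^N(G^N) < 4V(R)\big).
\end{align*}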

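Note that

\[ 4\lambda^{H_1} v(R) = 4\lambda^{H_1} v(c_1\lambda^{-J}h(N)) \le 4\lambda^{H_1} C_1 (c_1\lambda^{-J})^{d_2} v(h(N)) \le 4C_1 c_1^{d_2} \lambda^{-1} v(h(N)), \]

where we used $J \ge (1+H_1)/d_2$ in the last inequality. Using this, we have

\begin{align*}
P\big(\mu^N(G^N) < 4V(R)\big) &\le P\big(\mu^N(G^N) < 4\lambda^{H_1}v(R)\big) + P\big(\lambda^{H_1}v(R) < V(R)\big) \\
&\le P\big(\mu^N(G^N) < 4C_1 c_1^{d_2}\lambda^{-1} v(h(N))\big) + p_1(\lambda) \\
&\le p_2\big(\lambda/(4C_1 c_1^{d_2})\big) + p_1(\lambda),
\end{align*}

which implies the conclusion of (2). The proof of (3) is almost the same, so we omit it. □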

Proof of Proposition 6.5. We only indicate how to apply previous propositions. First, the upper bound of $t^p_{\mathrm{mix}}(C^N)$ can be obtained by Proposition 6.4 (1) with $v(R) = R^2$, $r(R) = R$, $h(N) = N^{1/3}$ and $p_1(\lambda) = c_0\lambda^{-q_0}$, $p_2(\lambda) = c_0'\lambda^{-q_0'}$ for some $c_0, c_0', q_0, q_0' > 0$. Indeed, the assumption of that proposition holds because of [33, Theorem 2.1 (a),(b), Theorem 6.1] and the fact $\mathrm{diam}_{d_{C^N}}(C^N) \ge \mathrm{diam}_R(C^N)$, which is due to (6.2).

The lower bound is more complicated. Using Propositions 5.5-5.7 and (5.1) in [35] with

P = \- 1 /\L = X H \a = X Hl ,r = R,h = C 3 X~ H2 R,m = X~ H3 (C 3 X- H2 R) 2 , 

and then taking $R = c_1\lambda^{-J}N^{1/3}$, $H_0 = 0$ (due to (6.2)), $H_1 = H_2 = 2$, $H_3 = 4$, $J = (1+H_1)/2 = 3/2$, we see that for each $v \in G^N$,

\[ P\big(\#C(v) > \lambda^{-1/4}N^{2/3} \text{ and } A\big) \le c_4\lambda^{-1/2}N^{-1/3}, \]

where

\[ A = \big\{V(v, C_3\lambda^{-2}R) < \lambda^{-4}(C_3\lambda^{-2}R)^2,\ R_{\mathrm{eff}}(v,B(v,R)^c) < \lambda^{-2}R,\ \#E(B(v,R)) > \lambda^{2}R^2\big\}. \]

This corresponds to [35, (5.3)]. Now using Proposition 6.2 i) and arguing similarly to the proof of [35, Theorem 2.1 (c.2)], we have

\[ P\big(\exists v \in G^N \text{ with } \#C(v) > \lambda^{-1/4}N^{2/3} \text{ and } t^1_{\mathrm{mix}}(C(v)) \le C_4\lambda^{-29/2}N\big) \le c_4\lambda^{-1/4}. \]

This together with (6.9) implies the desired lower bound of $t^p_{\mathrm{mix}}(C^N)$. □
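
The exponent $29/2$ appearing in the last display can be traced through the bookkeeping of the proof of Proposition 6.4 (2). The snippet below is a hedged sketch of that arithmetic: the function name `mixing_exponent` is hypothetical, the formula $p_0 = H_2' + H_3 + J(d_2 + \alpha_2)$ is read off from the lower bound on $t$ given there, and the values $d_2 = 2$, $\alpha_1 = \alpha_2 = 1$ are assumptions corresponding to $v(R) = R^2$, $r(R) = R$.

```python
from fractions import Fraction


def mixing_exponent(H0, H1, H2, H3, d2, alpha1, alpha2):
    """Return (H2', J, p0) for the bound t >= c * lambda^{-p0} * gamma(N).

    Illustrative bookkeeping only:
      H2' = H2 + (H0 + H2) * d2 / alpha1
      J   = (1 + H1) / d2          (smallest J allowed by J >= (1 + H1)/d2)
      p0  = H2' + H3 + J * (d2 + alpha2)
    """
    H2_prime = H2 + Fraction(H0 + H2) * d2 / alpha1
    J = Fraction(1 + H1, d2)
    p0 = H2_prime + H3 + J * (d2 + alpha2)
    return H2_prime, J, p0


if __name__ == "__main__":
    # Parameters used for the critical random graph: H0 = 0, H1 = H2 = 2, H3 = 4.
    H2_prime, J, p0 = mixing_exponent(0, 2, 2, 4, d2=2, alpha1=1, alpha2=1)
    assert (H2_prime, J, p0) == (6, Fraction(3, 2), Fraction(29, 2))
```

With these inputs $H_2' = 6$ and $J = 3/2$, so that $\lambda^{-H_2'-H_3}R^3$ with $R = c_1\lambda^{-3/2}N^{1/3}$ scales as $\lambda^{-29/2}N$.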

The proofs of this proposition and Proposition 6.6 highlight why it is useful to have a general theory where the exponents $H_0, \dots, H_3$ can vary.

Remark A.4. As mentioned in Remark 6.9 (2), it does not seem possible to apply current estimates for the graphs $(C^N)_{N \ge 1}$ and techniques for bounding mixing times to replace $\lambda^{-1}N \le t^p_{\mathrm{mix}}(C^N)$ by $\lambda^{-1}N \le t^p_{\mathrm{mix}}(\rho^N)$ in the bound of Proposition 6.5. The major difficulty is to verify the first inequality of (6.8) for $\varepsilon_0(\lambda)R$. Indeed, even if we choose $H_0, \dots, H_3$ large (which increases the chance that (6.4) and (6.5) hold for $R$), $\varepsilon_0(\lambda)$ gets small accordingly, so that the probability $P(\text{(6.4)} \wedge \text{(6.5) for } \varepsilon_0(\lambda)R)$ does not increase.






References 

[1] L. Addario-Berry, N. Broutin, and C. Goldschmidt, The continuum limit of critical random 
graphs, Probab. Theory Related Fields, to appear. 

[2] D. Aldous and J. Fill, Reversible Markov chains and random walks on graphs, Preprint
http://www.stat.berkeley.edu/~aldous/RWG/book.html

[3] M. T. Barlow, A. A. Járai, T. Kumagai and G. Slade, Random walk on the incipient infinite
cluster for oriented percolation in high dimensions, Comm. Math. Phys. 278 (2008), 385-431.

[4] I. Benjamini, G. Kozma and N. Wormald, The mixing time of the giant component of a 
random graph, preprint. 

[5] P. Bérard, G. Besson and S. Gallot, Embedding Riemannian manifolds by their heat kernel,
Geom. Funct. Anal. 4 (1994), 373-398.

[6] C. Borgs, J. T. Chayes, R. van der Hofstad, G. Slade and J. Spencer, Random subgraphs
of finite graphs: I. The scaling window under the triangle condition, Random Structures
Algorithms 27 (2005), 137-184.

[7] D. Burago, Y. Burago, and S. Ivanov, A course in metric geometry, Graduate Studies in 
Mathematics, vol. 33, American Mathematical Society, Providence, RI, 2001. 

[8] G.-Y. Chen and L. Saloff-Coste, The cutoff phenomenon for ergodic Markov processes, 
Electron. J. Probab. 13 (2008), 26-78. 

[9] D. A. Croydon, Scaling limit for the random walk on the largest connected component of 
the critical random graph, Publ. RIMS. Kyoto Univ., to appear. 

[10] D. A. Croydon, Convergence of simple random walks on random discrete trees to Brownian motion
on the continuum random tree, Ann. Inst. Henri Poincaré Probab. Stat. 44 (2008), 987-1019.

[11] D. A. Croydon, Volume growth and heat kernel estimates for the continuum random tree, Probab.
Theory Related Fields 140 (2008), 207-238.

[12] D. A. Croydon, Random walk on the range of random walk, J. Stat. Phys. 136 (2009), 349-372.

[13] D. A. Croydon, Scaling limits for simple random walks on random ordered graph trees, Adv. in
Appl. Probab. 42 (2010), 528-558.

[14] D. A. Croydon and B. M. Hambly, Local limit theorems for sequences of simple random 
walks on graphs, Potential Anal. 29 (2008), 351-389. 

[15] T. Duquesne, A limit theorem for the contour process of conditioned Galton-Watson trees, 
Ann. Probab. 31 (2003), 996-1027. 

[16] T. Duquesne and J.-F. Le Gall, Probabilistic and fractal aspects of Lévy trees, Probab.
Theory Related Fields 131 (2005), 553-603.

[17] P. Erdős and S. J. Taylor, Some intersection properties of random walk paths, Acta Math.
Acad. Sci. Hungar. 11 (1960), 231-248.






[18] N. Fountoulakis and B.A. Reed, The evolution of the mixing rate of a simple random walk 
on the giant component of a random graph, Random Structures Algorithms, 33 (2008), 
68-86. 

[19] M. Fukushima, Y. Oshima and M. Takeda, Dirichlet forms and symmetric Markov processes,
de Gruyter Studies in Mathematics, 19. Walter de Gruyter & Co., Berlin, 2011.

[20] B. V. Gnedenko and A. N. Kolmogorov, Limit distributions for sums of independent random
variables, Addison-Wesley Publishing Company, Inc., Cambridge, Mass., 1954, Translated
and annotated by K. L. Chung. With an Appendix by J. L. Doob.

[21] S. Goel, R. Montenegro, and P. Tetali, Mixing time bounds via the spectral profile, Electron. 
J. Probab. 11 (2006), 1-26. 

[22] A. Greven, P. Pfaffelhuber, and A. Winter, Convergence in distribution of random metric
measure spaces (Λ-coalescent measure trees), Probab. Theory Related Fields 145 (2009),
285-322.

[23] M. Heydenreich and R. van der Hofstad, Random graph asymptotics on high-dimensional
tori II. Volume, diameter and mixing time, Probab. Theory Related Fields 149 (2011),
397-415.

[24] A. Kasue and H. Kumura, Spectral convergence of Riemannian manifolds, Tohoku Math. 
J. 46 (1994), 147-179. 

[25] D. P. Kennedy, The distribution of the maximum Brownian excursion, J. Appl. Probab. 
13 (1976), 371-376. 

[26] J. Kigami, Resistance forms, quasisymmetric maps and heat kernel estimates, Memoirs 
AMS, to appear. 

[27] J. Kigami, Hausdorff dimensions of self-similar sets and shortest path metrics, J. Math. 
Soc. Japan 47 (1995), 381-404. 

[28] J. Kigami, Harmonic calculus on limits of networks and its application to dendrites, J. 
Funct. Anal. 128 (1995), 48-86. 

[29] J. Kigami, Analysis on fractals, Cambridge Tracts in Mathematics, vol. 143, Cambridge 
University Press, Cambridge, 2001. 

[30] T. Kumagai, Homogenization on finitely ramified fractals, Stochastic analysis and related
topics in Kyoto, Adv. Stud. Pure Math., vol. 41, Math. Soc. Japan, Tokyo, 2004, pp. 189-207.

[31] T. Kumagai and S. Kusuoka, Homogenization on nested fractals, Probab. Theory Related 
Fields 104 (1996), 375-398. 

[32] T. Kumagai and J. Misumi, Heat kernel estimates for strongly recurrent random walk on 
random media, J. Theoret. Probab. 21 (2008), 910-935. 

[33] J.-F. Le Gall, Random real trees, Ann. Fac. Sci. Toulouse Math. (6) 15 (2006), 35-62. 

[34] D. Levin, Y. Peres and E. Wilmer, Markov chains and mixing times, Amer. Math. Soc.,
Providence, RI, 2009.






[35] A. Nachmias and Y. Peres, Critical random graphs: diameter and mixing time, Ann. 
Probab. 36 (2008), 1267-1286. 

[36] A. Nachmias and Y. Peres, The critical random graph, with martingales, Israel J. Math. 
176 (2010), 29-41. 






